
News for nerds, stuff that matters

Question gzip Maven Jean-loup Gailly

Posted by Roblimo on Mon Mar 06, 2000 11:00 AM
from the picking-the-experts'-brains dept.
Jean-loup Gailly is the author of gzip and, now, CTO for Mandrakesoft, purveyors of Linux-Mandrake. Jean-loup's home page tells you quite a bit about him, including some interesting peeks into his life beyond Linux and open source software. Please try to keep it down to one question per post. Submitted questions will be chosen from the highest-moderated. Answers will appear within the next week.
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Hardware deflation? by Anonymous Coward (Score:1) Monday March 06 2000, @08:05AM
  • Re:LinuxOne by whoop (Score:1) Monday March 06 2000, @06:58AM
  • CPIO is used by lots of applications. by emil (Score:1) Monday March 06 2000, @11:21AM
  • Re:Why do you force the use of TAR? by klasker (Score:1) Friday March 10 2000, @11:33AM
  • Mandrake in a Fishbowl by ggoebel (Score:1) Friday March 10 2000, @12:19PM
  • Hrumph... by ggoebel (Score:1) Friday March 10 2000, @10:44AM
  • Re:LMAO by Denny (Score:1) Tuesday March 07 2000, @01:41AM
  • Here's one we made earlier! by Denny (Score:1) Monday March 06 2000, @06:41AM
  • Iagno is not Go... by deltab (Score:1) Monday March 06 2000, @07:51AM
  • next-generation zlib? by Cave Newt (Score:1) Monday March 06 2000, @08:03PM
  • multithreaded gzip by zoot (Score:1) Monday March 06 2000, @02:14PM
  • Re:Improved compression? by Tom Womack (Score:1) Tuesday March 07 2000, @12:01AM
  • Re:What About Ada? by Tom Womack (Score:1) Tuesday March 07 2000, @12:06AM
  • Re:multithreaded gzip by Tom Womack (Score:1) Tuesday March 07 2000, @12:11AM
  • Mandrake 7.0 Installer by Espressoman (Score:1) Monday March 06 2000, @01:20PM
  • Re:Nasty Code by sholden (Score:1) Monday March 06 2000, @10:50AM
  • Re:Why do you force the use of TAR? by EZ-G (Score:1) Monday March 06 2000, @09:38AM
  • Question by XtBart (Score:1) Monday March 06 2000, @07:38AM
  • Re:Patent issue by ChadN (Score:1) Monday March 06 2000, @04:27PM
  • Linux-Mandrake and GNOME by DrFickle (Score:1) Monday March 06 2000, @03:29PM
  • What About Ada? by Kartoffel (Score:1) Monday March 06 2000, @12:39PM
  • Re:LZW/".Z" decompression not covered by patent? by Zurk (Score:1) Monday March 06 2000, @02:50PM
  • Re:Improved compression? by Trojan (Score:1) Tuesday March 07 2000, @02:54PM
  • Re:Why do you force the use of TAR? by steve9000 (Score:1) Monday March 06 2000, @11:54AM
  • Re:Cross platform Mandrake by Afrosheen (Score:1) Monday March 06 2000, @09:13PM
  • Re:Mandrake 7.0 Installer by Afrosheen (Score:1) Monday March 06 2000, @09:15PM
  • zLib/zip/gzip in Closed Source software by XenonOfArcticus (Score:1) Monday March 06 2000, @08:49AM
  • Re:This is actually a serious question! by Eponymous, Showered (Score:1) Monday March 06 2000, @12:19PM
  • Re:Why do you force the use of TAR? by /ASCII (Score:1) Monday March 06 2000, @10:23AM
  • LMAO by Esperandi (Score:1) Monday March 06 2000, @03:59PM
  • Re:Improved compression? by Esperandi (Score:1) Monday March 06 2000, @04:08PM
  • This is actually a serious question! by cxd204 (Score:1) Monday March 06 2000, @07:53AM
  • MandrakeSoft and the Nasdaq by lascars (Score:1) Monday March 06 2000, @08:24PM
  • Corel vs Mandrake by smash_phase (Score:1) Monday March 06 2000, @12:50PM
  • Re:Why do you force the use of TAR? by Jonathan the Nerd (Score:1) Monday March 06 2000, @05:06PM
  • Re:This is actually a serious question! by Maïdjeurtam (Score:1) Monday March 06 2000, @11:32AM
  • Re:This is actually a serious question! by Maïdjeurtam (Score:1) Monday March 06 2000, @12:32PM
  • arithmetic coding by dbsears (Score:1) Monday March 06 2000, @06:51AM
  • Re:Mandrake for i386? [off-topic] by grolim13 (Score:1) Wednesday March 08 2000, @01:44AM
  • Re:What about wavelets? by kalifa (Score:1) Monday March 06 2000, @08:43AM
  • Re:What About Ada? by opk (Score:1) Sunday March 12 2000, @07:12AM
  • Linux for Everyman by caddmannq (Score:1) Monday March 06 2000, @10:52AM
  • AMD K6 and AMD K6-2 Memory B Stepping issue by hipdad (Score:1) Monday March 06 2000, @04:34PM
  • Re:AMD K6 and AMD K6-2 Memory B Stepping issue by hipdad (Score:1) Thursday March 09 2000, @04:09PM
  • LZW/".Z" decompression not covered by patent? by Anonymous Coward (Score:2) Monday March 06 2000, @09:17AM
  • Re:Bzip2 stability by jd (Score:2) Monday March 06 2000, @07:04AM
  • Bzip2 stability by Uruk (Score:2) Monday March 06 2000, @06:59AM
  • LinuxOne/Mandrake by Uruk (Score:2) Monday March 06 2000, @07:04AM
  • Big Money, Big Ideas. by SgtPepper (Score:2) Monday March 06 2000, @07:59AM
  • Re:arithmetic coding by Spud Zeppelin (Score:2) Monday March 06 2000, @07:21AM
  • compression algorithms by thingie (Score:2) Monday March 06 2000, @08:40AM
  • Patent issue by Straker Skunk (Score:2) Monday March 06 2000, @07:14AM
  • Mandrake for i386? by Seth Cohn (Score:2) Monday March 06 2000, @07:08AM
  • Re:Compression software by harmonica (Score:2) Tuesday March 07 2000, @05:34AM
  • Re:Improved compression? by Glenn R-P (Score:2) Tuesday March 07 2000, @04:10AM
  • Re:Improved compression? by Esperandi (Score:2) Monday March 06 2000, @05:32PM
  • Re:Improved compression? by Esperandi (Score:2) Wednesday March 08 2000, @02:04PM
  • Linux Start-Ups in France by viko (Score:2) Monday March 06 2000, @12:41PM
  • Re:Improved compression? by Signail11 (Score:2) Monday March 06 2000, @04:32PM
  • Re:Improved compression? by Signail11 (Score:2) Tuesday March 07 2000, @06:33AM
  • Cross platform Mandrake by Capt. DrunkenBum (Score:2) Monday March 06 2000, @07:18AM
  • Mandrake and Netware by Mad-cat (Score:2) Monday March 06 2000, @07:05AM
  • Improved compression? by br4dh4x0r (Score:2) Monday March 06 2000, @07:02AM
  • gzip in a resource-rich environment by Morbid Curiosity (Score:2) Monday March 06 2000, @06:58AM
  • by Anonymous Coward on Monday March 06 2000, @08:31AM (#1222742)
    I don't use gzip because of what I see as a major annoyance - it only gzips single files, so you have to use the ancient tar.

    Using tar makes things unnecessarily complicated. There is less support for tar around on non-UNIX platforms, and 'embedded compression/archiving' seems to cause great trouble for newbies who can just about handle WinZip and nothing more.

    If gzip is to become a truly viable alternative to patented zip, I think the .tar.gz should become a thing of the past.

    Remove the old legacy tape archive!

  • by emil (695) on Monday March 06 2000, @06:15AM (#1222743) Homepage

    Would you think it wise to roll alternatives to the Lempel-Ziv algorithms into gzip to make other compression utilities less attractive?

    It seems that this approach is adopted by other applications (ssh uses multiple encryption engines, and TIFF has allowed several compression techniques for quite a long time).

    Would you support an effort to implement bzip2 within gzip? Do you think such a thing could be done while maintaining gzip's stability?

  • Go! (Score:3)

    by Denny (2963) <`slashdot' `at' `ukfetish.info'> on Monday March 06 2000, @07:01AM (#1222744) Homepage Journal

    I notice you are a keen Go player... the GNOME version of Go (Iagno) seems much more attractive to me than the KDE version (kgo). I was wondering what software you use to play games, or are you not really interested in the interface at your level of play?

    Regards,
    Denny

    # Using Linux in the UK? Check out Linux UK [linuxuk.co.uk]

  • Astronomical! :) (Score:3)

    by Denny (2963) <`slashdot' `at' `ukfetish.info'> on Monday March 06 2000, @07:08AM (#1222745) Homepage Journal

    On your website, in the history section, you have a link to some information about pulsars...

    Were you an astronomy student, and if so how did you go from studying pulsars to CTO of a major Linux distributor?!?

    Regards,
    Denny

    # Using Linux in the UK? Check out Linux UK [linuxuk.co.uk]

  • LinuxOne (Score:3)

    by fReNeTiK (31070) on Monday March 06 2000, @06:31AM (#1222746)
    I have read about an agreement between Mandrake and LinuxOne to create a Chinese Linux development center. Did any good come out of this? Couldn't Mandrake's otherwise excellent reputation be damaged by such relationships?

  • Nasty Code (Score:3)

    by FigBug (69370) on Monday March 06 2000, @09:07AM (#1222747) Homepage
    Why do you write code like this:

    z = (z = g - w) > (unsigned)l ? l : z;

    It makes your code almost impossible to read. Do you even know what this line does anymore?

  • Re:Nasty Code (Score:3)

    by Signail11 (123143) on Monday March 06 2000, @10:27AM (#1222748)
    z = (z = g - w) > (unsigned)l ? l : z;
    I hate to sound like I'm flaming you, but this is a standard C idiom for clamping a value (a saturating subtraction here, rather than an addition). When (g - w) is larger than the limit l, z is assigned l; otherwise z keeps the value g - w.
    This code can also be written less efficiently (well, at least if your compiler doesn't have common sub-expression elimination) as:
    if ((g - w) > (unsigned) l) {
        z = l;
    } else {
        z = g - w;
    }
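
The clamp discussed above can be spelled out as a small function. This is only a minimal sketch: the name clamped_diff is hypothetical and does not come from the gzip sources, and it assumes unsigned operands with g >= w so the subtraction cannot wrap around.

```c
/* Hypothetical helper, not taken from the gzip sources.
 * Returns g - w, capped at the limit l.
 * Assumes g >= w, so the unsigned subtraction does not wrap. */
unsigned clamped_diff(unsigned g, unsigned w, unsigned l)
{
    unsigned z = g - w;       /* compute the difference once */
    return z > l ? l : z;     /* cap at l, otherwise keep the difference */
}
```

With g = 10, w = 3 and l = 5 the difference 7 exceeds the limit, so the function returns 5; with w = 8 it returns the plain difference 2.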
  • by Signail11 (123143) on Monday March 06 2000, @11:59AM (#1222749)
    The answer to your question is no. Very briefly, it is not possible to build a universal compressor that can reduce the size of all possible inputs, nor is it possible for a lossless compressor to emit an output with less information content (i.e. Shannon entropy) than the input. It is not possible to have a compressor that takes in, say, all 1MB text files and always outputs 10k compressed files, simply because most 1MB text files contain more information than can be expressed in 10k.

    A much more intuitive argument is the "pigeonhole principle." Let's assume that there are 16 holes in a wall, each associated with one message. It is impossible for 17 messages each to be uniquely associated with a hole, because there are not enough holes available. A 4-bit file can represent only 16 different messages, regardless of what algorithm is used to compress the message... unless, that is, you don't care about the compression being reversible!
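
The counting behind the pigeonhole argument can be checked directly. The sketch below is illustrative only (all function names are hypothetical, and n is assumed to be below 63 so the shifts fit in 64 bits). It counts the bit strings of exactly n bits and compares that with the number of strictly shorter strings that could serve as compressed outputs.

```c
#include <stdint.h>

/* Number of distinct bit strings of exactly n bits: 2^n (n < 63 assumed). */
uint64_t strings_of_length(unsigned n)
{
    return (uint64_t)1 << n;
}

/* Number of distinct bit strings strictly shorter than n bits,
 * including the empty string: 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1. */
uint64_t strings_shorter_than(unsigned n)
{
    uint64_t total = 0;
    for (unsigned len = 0; len < n; len++)
        total += strings_of_length(len);
    return total;
}

/* A reversible compressor needs a distinct output for every input.
 * Returns 1 only if there are enough shorter strings to go around. */
int can_shrink_every_input(unsigned n)
{
    return strings_shorter_than(n) >= strings_of_length(n);
}
```

For n = 4 there are 16 inputs but only 15 shorter strings, so at least one 4-bit input cannot shrink; the shortfall of one holds for every n.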
  • If you could compress anything and put it in your pocket what would you choose and why?
  • by sparkes (125299) on Monday March 06 2000, @06:27AM (#1222751) Homepage Journal
    I see from your homepage that you avoided patents when writing gzip. How do you feel about the current explosion of software-related patents?
  • by emil (695) on Monday March 06 2000, @06:34AM (#1222752) Homepage

    I guess that you have at least a little something to say about this.

    Is the 586 optimization enough to justify Mandrake's position? Are you especially proud of any of the architectural differences between the distributions (from what I have been told, the Apache-PHP layout is quite a bit different)?

    How do you feel about the steps that Red Hat has taken to change their distribution in reaction to yours?

  • by Tom Womack (8005) <tom@womack.net> on Monday March 06 2000, @07:09AM (#1222753) Homepage

    The Data Compression Book was an excellent reference when it came out, but there are some hot topics in compression that it doesn't cover - frequency-domain lossy audio techniques (MP3), video techniques (MPEG2 and especially MPEG4), wavelets (Sorenson video uses these, I believe, and JPEG2000 will), and the Burrows-Wheeler transform from bzip.

    Do you have any plans for a new edition of the book, or good Web references for these techniques? BZip is covered well by a Digital research note, but documentation for MPEG2 seems only to exist as source code and I can't find anything concrete about using wavelets for compression. The data is all there on the comp.compression FAQ, but the excellent exposition of the book is sorely lacking.
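
For readers unfamiliar with the Burrows-Wheeler transform mentioned above, here is a deliberately naive sketch of the forward transform. It is not bzip2's implementation: bzip2 uses a far faster sort and follows the transform with further coding stages. The names bwt_naive and compare_rotations are hypothetical, and the input is assumed to end in a unique sentinel character such as '$'.

```c
#include <stdlib.h>
#include <string.h>

static const char *bwt_text;   /* input shared with the qsort comparator */
static size_t bwt_len;

/* Compare two rotations of bwt_text, identified by their start offsets. */
static int compare_rotations(const void *a, const void *b)
{
    size_t i = *(const size_t *)a, j = *(const size_t *)b;
    for (size_t k = 0; k < bwt_len; k++) {
        unsigned char x = (unsigned char)bwt_text[(i + k) % bwt_len];
        unsigned char y = (unsigned char)bwt_text[(j + k) % bwt_len];
        if (x != y)
            return x < y ? -1 : 1;
    }
    return 0;
}

/* Naive forward transform: sort all rotations, emit the last column.
 * `out` needs room for strlen(text) + 1 bytes. Returns 0 on success,
 * -1 if memory allocation fails. */
int bwt_naive(const char *text, char *out)
{
    size_t n = strlen(text);
    size_t *starts = malloc((n ? n : 1) * sizeof *starts);
    if (starts == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        starts[i] = i;
    bwt_text = text;
    bwt_len = n;
    qsort(starts, n, sizeof *starts, compare_rotations);
    for (size_t i = 0; i < n; i++)
        out[i] = text[(starts[i] + n - 1) % n];  /* last char of rotation */
    out[n] = '\0';
    free(starts);
    return 0;
}
```

Calling bwt_naive("banana$", out) leaves "annb$aa" in out. The transform saves no space by itself; it tends to group identical characters together, which makes the later coding stages more effective.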
  • Go and Compression (Score:4)

    by Inquisiter (155042) on Monday March 06 2000, @07:59AM (#1222754)
    When I think of a game like go or chess, I think that each player develops their own algorithm to beat their opponent. If you agree, what relationships or similarities do you see between your interest in Go and your interest in compression?

    Inquiring minds want to know.
  • bzip2 Support (Score:5)

    by Aaron M. Renn (539) <arenn@urbanophile.com> on Monday March 06 2000, @06:09AM (#1222755) Homepage
    When is gzip going to provide (transparent) support for bzip2 files and the Burrows-Wheeler algorithm?

    Will BW be an algorithm option within the gzip file format itself ever?
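
As background to this question: a gzip file starts with the two magic bytes 0x1f 0x8b followed by a compression-method byte, for which the published format defines the value 8 (deflate), while bzip2 files start with the signature "BZh". One way a tool could offer transparent handling is to sniff those leading bytes and dispatch accordingly. The sketch below only illustrates that idea; sniff_format and gzip_method are hypothetical names, not functions from gzip or bzip2.

```c
#include <stddef.h>

enum stream_kind { KIND_UNKNOWN, KIND_GZIP, KIND_BZIP2, KIND_COMPRESS };

/* Hypothetical helper: guess the format from the first bytes of a file. */
enum stream_kind sniff_format(const unsigned char *buf, size_t len)
{
    if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b)
        return KIND_GZIP;                 /* gzip magic number */
    if (len >= 3 && buf[0] == 'B' && buf[1] == 'Z' && buf[2] == 'h')
        return KIND_BZIP2;                /* bzip2 signature "BZh" */
    if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x9d)
        return KIND_COMPRESS;             /* compress (.Z) magic number */
    return KIND_UNKNOWN;
}

/* Return the compression-method byte of a gzip header, or -1 if the
 * buffer is not a gzip stream. The format defines 8 for deflate. */
int gzip_method(const unsigned char *buf, size_t len)
{
    if (len < 3 || sniff_format(buf, len) != KIND_GZIP)
        return -1;
    return buf[2];
}
```

The method byte is what would, in principle, let another algorithm live inside the gzip container; whether that is desirable is exactly what the question asks.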

  • by jd (1658) <[imipak] [at] [yahoo.com]> on Monday March 06 2000, @07:13AM (#1222756) Homepage Journal
    It is a "truism" in the Free Software community that code should be released early and released often.

    However, much of the software you've written has started gathering a few grey hairs. Gzip, for example, has been at 1.2.4 for many, many moons, and looks about ready to collect its gold watch.

    Is compression software in a category that inherently plateaus quickly, so that significant further work simply isn't possible? Or is there some other reason, such as Real Life(tm) intruding and preventing any substantial development?

    (I noticed, for example, a gzip patch for files larger than 4 GB, which could have been rolled into the master sources to make a 1.2.5. This hasn't happened.)

  • Winzip (Score:5)

    by Uruk (4907) on Monday March 06 2000, @06:55AM (#1222757)
    I noticed that you allowed the people who make the Winzip product to incorporate code written for Gzip. I think it's cool that you did that, because it would be horrible if winzip couldn't handle the gzip format, but at the same time, what are your thoughts about allowing free software code to be included in closed-source products?

    Just out of curiosity (tell me it's none of my business if you want to, and I'll be OK with that), did you receive a licensing fee for the code from the company that makes Winzip?

  • by Tom Womack (8005) <tom@womack.net> on Monday March 06 2000, @07:15AM (#1222758) Homepage
    The field of compression has been thronged with patents for a long time - but patents at least reveal the algorithm.

    What do you think of the expansion of trade-secret algorithms (MP3 quantisation tables, Sorenson, RealAudio and RealVideo, Microsoft Streaming Media) where the format of the data stream is not documented anywhere?

    Tom
  • by Stephen (20676) on Monday March 06 2000, @06:26AM (#1222759) Homepage
    The compression world has many patents, notably for Lempel-Ziv compression as used in GIF. What is your view on companies patenting non-obvious algorithms for such processes as data compression?
  • by drudd (43032) on Monday March 06 2000, @06:16AM (#1222760)
    I am a happy owner of The Data Compression Book (2nd Ed). With the increasing availability of compression routines within libraries (Java's GZIP streams spring to mind), does this make your book a little unnecessary?

    Should software authors continue to write their own compression routines, or simply trust the versions available to them in library form?

    I can see some definite advantages to library code, i.e. the ability to upgrade routines, and having standardized algorithms which can be read by any program which utilizes the library.

    Doug
  • As we all know, at first Mandrake was little more than a repackaged version of RedHat. That's changed a bit with the newer versions. My question is this: to what degree will Mandrake continue to differ from RedHat, and will there ever be a "developer" version (i.e. one that is aimed at those who are a bit more technically competent)?

    Brad Johnson
    --We are the Music Makers, and we
    are the Dreamers of Dreams