|
| Author |
New field in binary stanza
|
|
| David Paleino 2007-12-24, 1:25 pm |
| | |
| Neil Williams 2007-12-24, 1:25 pm |
| On Mon, 24 Dec 2007 16:51:13 +0100
David Paleino <d.paleino@gmail.com> wrote:
> Hi *,
> would it be possible to have a "License" field in the information of a package?
Why? What is the benefit?
A machine-interpretable format for debian/copyright is already
available. Why clutter the dpkg and apt-cache with licence lines?
We already have main vs contrib vs non-free. Why subdivide main?
Some packages can have multiple (compatible) licences - the details of
what is licenced under which can only be properly determined by reading
debian/copyright. It's installed for every package so I don't see the
point.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
| |
| David Paleino 2007-12-24, 1:25 pm |
| | |
| Julien Cristau 2007-12-24, 1:25 pm |
| On Mon, Dec 24, 2007 at 16:51:13 +0100, David Paleino wrote:
> Hi *,
> would it be possible to have a "License" field in the information of a package?
> I mean, "apt-cache show foo" shows the fields defined in debian/control and
> some others. Would it be possible to parse the license from debian/copyright
> and add it to that info? Or, at least, give the chance to developers to use
> something like XS-License: in the source stanza of debian/control?
>
So the Packages file would be as big as the combination of all
debian/copyright files in the archive?
Cheers,
Julien
--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
| |
| David Paleino 2007-12-24, 1:25 pm |
| | |
| Daniel Brumbaugh Keeney 2007-12-24, 1:25 pm |
| On Dec 24, 2007 11:07 AM, David Paleino <d.paleino@gmail.com> wrote:
> If the license is free, but it's not a "standard" one, one could always write:
>
> License: see debian/copyright.
> David
That seems unnecessary, being the effective default.
Daniel Brumbaugh Keeney
| |
| David Paleino 2007-12-24, 1:25 pm |
| | |
| Neil Williams 2007-12-24, 1:25 pm |
| On Mon, 24 Dec 2007 17:36:06 +0100
David Paleino <d.paleino@gmail.com> wrote:
>
> debian/copyright is not available via the APT cache, thus cannot be available
> to wrappers like python-apt and others.
So? Why is knowing the licence important before installation anyway?
It's in main, it's free software. It's not in main, a one-line addition
in the dpkg output is not going to tell you much about why.
From a user perspective, there is no difference between any package in
main as far as a licence is concerned.
>
> Again, it's not about _installed_ packages, but about fetching this information
> from the APT cache (i.e. can't install packages on Alioth just to read
> debian/copyright...), or any other place that won't require root privileges
> (debian/copyright is online, but I believe that parsing it might be kind of a
> nightmare, if one wants to give a "standardized" output).
I can't see any point in having such output available to the user.
If you want this data, write a dedicated wrapper - don't burden
everyone else with an extra 20,000 lines in Packages.gz - create a
local mirror if necessary.
> [1] http://debian-med.alioth.debian.org/tasks/bio.php
So for the sake of one webpage, every Debian user gets yet more bloat
in Packages.gz. Oh good. Sorry, I think that's a really really really
bad idea. Almost makes me think it's 1st April.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
| |
| Stefano Zacchiroli 2007-12-24, 1:25 pm |
| On Mon, Dec 24, 2007 at 04:51:13PM +0100, David Paleino wrote:
> would it be possible to have a "License" field in the information of a package?
I understand your need, but in this case (as opposed to the others you
mention) I believe a new field is not the right solution. The reason is
that in the general case too much information would need to be encoded
in such a field; that's why a machine interpretable copyright format has
been proposed [1].
To avoid bloating the Sources (see other replies) the only possible way
in between would be to have such a field only for "simple cases" (e.g.
GPL-only packages). But I would much rather have no information than
partial information.
Maybe the related question is: once the debian/copyright format is
widespread enough, how can we make such information available
archive-wide mechanically?
[1] http://wiki.debian.org/Proposals/CopyrightFormat
--
Stefano Zacchiroli -*- PhD in Computer Science ............... now what?
zack@{cs.unibo.it,debian.org,bononia.it} -%- http://www.bononia.it/zack/
(15:56:48) Zack: e la demo dema ? /\ All one has to do is hit the
(15:57:15) Bac: no, la demo scema \/ right keys at the right time
| |
| David Paleino 2007-12-24, 1:25 pm |
| | |
| David Paleino 2007-12-24, 1:25 pm |
| | |
| Neil Williams 2007-12-24, 7:20 pm |
| On Mon, 24 Dec 2007 18:43:57 +0100
David Paleino <d.paleino@gmail.com> wrote:
>
> It's not for users, it's for developers.
But you cannot separate the content of the Packages.gz file according
to whether the viewer is a user or developer. What goes into the
Packages file gets sent to everyone using main, whether they need the
licence data or not.
Now it might be nice to change that so that some users (all developers
are also users so that covers everyone) can have a minimal apt cache
download and Packages.gz file, some may want a maximal cache. However,
that isn't going to happen quickly and adding more fields is not the
way to seek it.
I'd support that because Emdebian could do with a smaller Packages.gz
file - maybe miniPackages.gz alongside Packages.gz. It's relatively easy
- a parser built into the repository tools to strip out certain fields
like Priority, Section, maybe even drop Maintainer and Homepage for
certain embedded devices that won't have a functional 'reportbug' on
the device itself. Then an option for apt (in /etc/apt/apt.conf.d/
IIRC) that uses this file for such devices.
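As a rough illustration of the field-stripping parser described above, here is a minimal Python sketch. It assumes the usual Packages layout (one "Field: value" per line, continuation lines indented, stanzas separated by blank lines). The names DROP_FIELDS and strip_fields are hypothetical and do not belong to any existing repository tool.

```python
# Hypothetical filter that drops selected fields from a Packages index.
# DROP_FIELDS and strip_fields() are illustrative names only.

DROP_FIELDS = {"Priority", "Section", "Maintainer", "Homepage"}


def strip_fields(packages_text, drop=DROP_FIELDS):
    """Return the index text with the named fields removed from every stanza."""
    out = []
    skipping = False
    for line in packages_text.splitlines():
        if line[:1] in (" ", "\t"):
            # Continuation line: it shares the fate of the field it continues.
            if not skipping:
                out.append(line)
            continue
        # A new field (or a blank stanza separator) starts here.
        name = line.split(":", 1)[0]
        skipping = name in drop
        if not skipping:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "Package: foo\n"
        "Priority: optional\n"
        "Section: utils\n"
        "Description: short\n"
        " long line\n"
    )
    print(strip_fields(sample), end="")
```

A real tool would also have to handle compression and regenerate the checksums listed in the Release file; that is outside the scope of this sketch.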
There has been discussion on having translation status in Packages.gz
too - Emdebian has an alternative to that but it may be something that
could be added to max_Packages.gz etc. for those devices where space
and bandwidth are not an issue.
>
> I'll do, if necessary.
I'd say it is necessary.
It would go a long way to restarting the whole issue of machine
interpreted copyright info and encouraging other developers to support
it. (I've converted some of my packages but without a visible benefit
of doing so, I'm not exactly rushing to convert the rest.)
> First of all, it's not just that webpage. Go to /tasks/, and you'll see that
> there are others. But the most important thing is that this will probably be a
> feature of cdd-dev; probably in future other CDDs (Debian-Edu, ...) will also
> use that. From this point of view, Debian-Med is kind of a "prototype" for
> cdd-dev.
I still can't see why the licence is relevant to such a website /
interface / task list. The package is in main - that's all you need to
know until you need to copy code from that package into your own.
Redistributing anything in main is explicitly allowed without any need
for any other information.
> I repeat, if it's needed, I'll write a parser for this. I just wanted to have
> the chance not to write a huge script that does the job.
Instead you want to add data to everyone's apt cache - that isn't a
good deal for everyone else. A parser is the sane solution.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
| |
| Russ Allbery 2007-12-24, 7:20 pm |
| David Paleino <d.paleino@gmail.com> writes:
> would it be possible to have a "License" field in the information of a
> package? I mean, "apt-cache show foo" shows the fields defined in
> debian/control and some others. Would it be possible to parse the
> license from debian/copyright and add it to that info? Or, at least,
> give the chance to developers to use something like XS-License: in the
> source stanza of debian/control?
You cannot reduce the licensing status of a package to a small number of
keywords in many cases. It's really only possible for packages covered by
the "big licenses" like the GPL. Even with relatively standardized
licenses like MIT or BSD, there are slight variations in wording between
different packages.
--
Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>
| |
| Michael Tautschnig 2007-12-25, 7:33 am |
| [...]
>
> Well, my proposal was for an optional field: who wants it, uses it.
>
What about debtags? Wouldn't this be most appropriate?
- It's optional.
- It's available from the apt-cache.
- No need to change dpkg, policy, etc.
Best,
Michael
| |
| Stefano Zacchiroli 2007-12-25, 7:33 am |
| On Mon, Dec 24, 2007 at 06:52:12PM +0100, David Paleino wrote:
> First of all, thank you for the kind reply. It seemed like the
> Christmas spirit has been blown away from this list.
Thank you for noticing, I still hope that exchanges like this have the
power of improving in the long run the debate quality on -devel.
> Well, my proposal was for an optional field: who wants it, uses it.
Well, what's it for then?
Additional informative fields which are targeted to users (like for
example the Homepage one) are indeed useful if present and do no harm
elsewhere, but if this is assumed to be developer targeted then (IMO of
course) you really need precise information, which can't be
captured in the general case by such a field.
> Anyway, I'm seeing that what I'm telling now has already been proposed
> for debian/copyright. The problem is still there though: the chance to
> see some information about the license of not installed packages not
> being connected to the Internet.
Yup, got it, but I'm convinced this can be solved stepwise. The first
(big!) step is to get the new debian/copyright format widespread in the
archive. (I'm convinced that that won't happen until we have tools to
process it, that's why on my personal todo list there is a parser for
the format to be integrated into python-debian, but time is always
lacking ...)
Once we have that, I'm convinced it won't be feasible to have the
information embedded in the apt-cache, for the reasons of size bloat
expressed by others.
Nevertheless, we can imagine having for example a daily updated big
tarball of debian/copyrights generated on the mirrors, and then a patch
for aptitude which downloads it if requested and then implements the
ability to show it to the users. The technical details are not important
at this point, since first we need to spread the usage of the new
format.
Notice that in the simple cases (i.e. GPL only package) the
debian/copyright won't be much more complicated than the field you
propose; it will simply have a catch-all glob pattern "*" pointing to
the GPL.
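To make the simple case concrete, here is a minimal Python sketch of a catch-all stanza and a lookup against it. It assumes blank-line-separated stanzas carrying a "Files:" list of whitespace-separated globs and a "License:" short name, and it assumes that a later matching stanza overrides an earlier one. The exact field names and rules of the proposal may differ; parse_stanzas and license_for are hypothetical helpers.

```python
from fnmatch import fnmatch

# Assumed layout, for illustration only: one stanza with a catch-all glob.
SIMPLE_COPYRIGHT = """\
Files: *
License: GPL
"""


def parse_stanzas(text):
    """Return a list of (glob patterns, license name) pairs, in file order."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            # Only top-level "Field: value" lines; continuation lines are ignored.
            if ":" in line and not line.startswith((" ", "\t")):
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        if "Files" in fields and "License" in fields:
            stanzas.append((fields["Files"].split(), fields["License"]))
    return stanzas


def license_for(path, stanzas):
    """Return the license of the last stanza whose glob matches path, or None."""
    found = None
    for patterns, license_name in stanzas:
        if any(fnmatch(path, pattern) for pattern in patterns):
            found = license_name
    return found


if __name__ == "__main__":
    print(license_for("src/main.c", parse_stanzas(SIMPLE_COPYRIGHT)))
```

With a single "*" stanza every path resolves to the same license, which is about as much information as the proposed one-line field would carry.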
> That might be an alternative. Is there any progress on the
> CopyrightFormat proposal? I can't find anything on the wiki.
None yet, and it's hardly possible to make stats until we have at least
a parser for the new format. Once we have that one can imagine setting
up a statistics page showing how many packages in the archive are using
such a format, but not before.
Cheers.
--
Stefano Zacchiroli -*- PhD in Computer Science ............... now what?
zack@{cs.unibo.it,debian.org,bononia.it} -%- http://www.bononia.it/zack/
(15:56:48) Zack: e la demo dema ? /\ All one has to do is hit the
(15:57:15) Bac: no, la demo scema \/ right keys at the right time
| |
| David Paleino 2007-12-25, 7:33 am |
| | |
| Enrico Zini 2007-12-25, 1:25 pm |
| On Mon, Dec 24, 2007 at 06:39:22PM +0100, Stefano Zacchiroli wrote:
> Maybe the related question is: once the debian/copyright format is
> widespread enough, how can we make such information available
> archive-wide mechanically?
Easy: apt-xapian-index.
It works like this:
1. Define what kind of searches you want to allow people to do
2. Define what kind of information you need to index for those searches
3. Create a tool that collects the information you need to index and
publishes them somewhere on the net
4. Create a tool that downloads the data into a system and feeds it
into apt-xapian-index, by installing a plugin in
/usr/share/apt-xapian-index/plugins
5. Create an interface that takes advantage of this
6. Profit.
Debtags works in this way, although it also supports getting tags from
the packages file.
I've already worked out steps 1 to 3 for popcon information and
iterating.com package ratings, and sooner or later I intend to work on
step 4 [1].
Now, with regards to licenses, the question is whether steps 1 and 2 can
be attacked in any useful way. I'll write a separate message about
that.
Ciao,
Enrico
[1] If anyone would like it sooner, I'm happy to assist them doing
it.
--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@debian.org>
| |
| Enrico Zini 2007-12-25, 1:25 pm |
| On Mon, Dec 24, 2007 at 06:52:12PM +0100, David Paleino wrote:
> Anyway, I'm seeing that what I'm telling now has already been proposed for
> debian/copyright. The problem is still there though: the chance to see some
> information about the license of not installed packages not being
> connected to the Internet.
This is solvable by packaging a file with the data extracted from the
archive: the information will then end up in the CD. I do that for
debtags.
> Well, most of Debian packages have simple licenses (see: GPL, BSD, MIT). And,
> again, the field would be totally optional.
In the other mail I sent to this thread I was showing the steps that
could be followed to implement this with apt-xapian-index. The tricky
parts are the first and second step:
1. Define what kind of searches you want to allow people to do
2. Define what kind of information you need to index for those searches
I mentioned that it might not be possible to attack these steps in a
useful way. To understand why I say this, consider:
- the variety of licenses we have in the archive
- that different bits of a package can have different licenses
- that the copyright file applies to the source package but the search
probably happens on binary packages.
I had a look at http://wiki.debian.org/Proposals/CopyrightFormat, and I
strongly endorse that proposal. The 'License:' field proposed there
looks like it's the best data source for this. However, if more than
something like 20% of the packages in the archive end up having
'License: other', in my experience that field risks ending up
useless for searches.
Consider also this scenario:
Source package foo contains a debian/copyright file that says "the
library is LGPL, the executable tools are GPL, the examples are WTFPL,
the debian packaging is BSD-3"[1].
How should we handle it? I can think of two cases:
1. libfoo-dev only shows LGPL, libfoo-bin only shows GPL,
libfoo-examples only shows WTFPL. In this case, how do you sort
the various licenses into the binary packages? And also, where
did BSD-3 go?
2. All the binary packages list all the licenses. In this case,
when you search for WTFPL (or BSD-3) you end up with libfoo-dev,
libfoo-bin and loads of other false positives among the results.
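The two cases can be played out with a small Python sketch. The package names and licenses are the ones from the scenario above; the per-binary assignment in case 1 is written by hand here, which is exactly the part that has no obvious automatic answer. The search helper is hypothetical.

```python
# Licenses declared in the debian/copyright of the hypothetical source package foo.
SOURCE_LICENSES = {"LGPL", "GPL", "WTFPL", "BSD-3"}
BINARIES = ["libfoo-dev", "libfoo-bin", "libfoo-examples"]

# Case 1: one license per binary package, assigned by hand.
# BSD-3 (the packaging) ends up attached to nothing.
per_binary = {
    "libfoo-dev": {"LGPL"},
    "libfoo-bin": {"GPL"},
    "libfoo-examples": {"WTFPL"},
}

# Case 2: every binary package inherits every license of its source package.
inherited = {name: set(SOURCE_LICENSES) for name in BINARIES}


def search(index, license_name):
    """Return the sorted names of packages whose license set contains license_name."""
    return sorted(name for name, licenses in index.items() if license_name in licenses)


if __name__ == "__main__":
    print(search(per_binary, "WTFPL"))  # only the examples package
    print(search(per_binary, "BSD-3"))  # nothing: the packaging license is lost
    print(search(inherited, "WTFPL"))   # all three packages: false positives
```

Case 1 gives precise results but loses BSD-3 and needs a mapping nobody has defined; case 2 needs no mapping but returns every binary package for every license.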
I know it's easy to think "'License: GPL' is all I need", and I also
know it's easy to think "it's too much of a mess, it can't be done".
What is hard to think is "let's see what really can be done".
To really attack this problem, we need to have some statistics about
what really is the distribution of licenses around the archive, so we
really know what we're talking about. I suppose that starting to adopt
http://wiki.debian.org/Proposals/CopyrightFormat could be a good way to
make it possible to collect such statistics.
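Once 'License:' values can be extracted, the statistics themselves are simple to compute. The sketch below is Python and uses an invented sample list purely to show the calculation; it says nothing about the real archive. license_distribution is a hypothetical helper.

```python
from collections import Counter


def license_distribution(licenses):
    """Map each license name to its share (0.0 to 1.0) of the given list."""
    counts = Counter(licenses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    # Invented sample data, one entry per package; not real archive figures.
    sample = ["GPL"] * 6 + ["BSD"] * 2 + ["other"] * 2
    shares = license_distribution(sample)
    for name, share in sorted(shares.items()):
        print(f"{name}: {share:.0%}")
```

The share of 'other' is the number to watch: it is the figure that would be compared against the rough 20% threshold mentioned above.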
Another rather important thing that can be done at this stage is to
provide use cases for using the data, check if
http://wiki.debian.org/Proposals/CopyrightFormat provides enough
information to support those use cases, and in case something is missing
see if it can reasonably be added and how.
Ciao,
Enrico
[1] Once CC-BY-SA 3.0 is out, you can reasonably add "Documentation
is CC-BY-SA-3", and a libfoo-doc package to the list.
--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@debian.org>
| |
| Enrico Zini 2007-12-25, 1:25 pm |
| On Tue, Dec 25, 2007 at 11:47:59AM +0100, Michael Tautschnig wrote:
> What about debtags? Wouldn't this be most appropriate?
> - It's optional.
> - It's available from the apt-cache.
> - No need to change dpkg, policy, etc.
Please see http://debtags.alioth.debian.org/fa...tags-in-debtags
I should update that entry by pointing at the other two messages I wrote
to this thread (or possibly, to the whole thread). In fact, a tag
cannot convey enough information to represent the licensing information
for a package.
Ciao,
Enrico
--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@debian.org>
| |
| Magnus Holmgren 2007-12-28, 7:33 am |
| |
|
|
|
|