Hi Leif,
I added two pipes to builtins.py:
- publish_html: creates static HTML views of IdPs and SPs, using XSLT based on Peter Schober's alternative to MET;
- publish_split: similar to store, but adds validUntil and creates a signed XML file per EntityDescriptor. This can be consumed dynamically by ADFS in an IdP role.
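Roughly, publish_split does something like this (a simplified sketch, not the actual pipe code; the file naming, validity window, and signing hook are illustrative only):

    from datetime import datetime, timedelta, timezone
    from pathlib import Path
    from urllib.parse import quote_plus

    from lxml import etree

    MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"

    def split_entities(aggregate_path, out_dir, valid_for_days=7):
        tree = etree.parse(aggregate_path)
        valid_until = (datetime.now(timezone.utc) + timedelta(days=valid_for_days)).strftime("%Y-%m-%dT%H:%M:%SZ")
        out = Path(out_dir)
        out.mkdir(parents=True, exist_ok=True)
        for ed in tree.iter(f"{{{MD_NS}}}EntityDescriptor"):
            ed.set("validUntil", valid_until)
            # each file would be signed here (shared code with the sign pipe) before writing
            name = quote_plus(ed.get("entityID"))
            (out / f"{name}.xml").write_bytes(etree.tostring(ed, xml_declaration=True, encoding="UTF-8"))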
I put it directly into builtins.py because it shares some code with the sign pipe. Is this viable from your PoV? If yes, I would make a PR.
Cheers, Rainer
Hi all,
being part of Commons Conservancy brought up yet another subject,
which is whether we should add a header with license information in
every file in the projects under idpy. This is not something done in
an abstract way; there is a specific format modelling this information
(see https://spdx.org/ and https://reuse.software/ - more specifically
https://reuse.software/practices/2.0/). Still, I find it problematic.
We want to open up the question to the wider community and consider
their thoughts on this. The forwarded message below discusses this
subject. You can see the question we posed, the answer we got, and my
comments. Feel free to tell us what you think on this.
---------- Forwarded message ---------
Date: Thu, 16 May 2019 at 09:56
> ---------- Forwarded message ----------
> Date: May 8, 2019, 8:15 AM -0700
>
> > Why does CC think having a single license file per project is
> > insufficient? Our thought is that if we can avoid adding a header to
> > every single file, that would be nice, esp. given we already have this
> > info in the license file and we have the Note Well.
>
>
> this is not just our opinion, but something that is an industry and
> community standard for legal compliance these days. When companies like
> Siemens, Samsung or Honeywell use some code in one of the hundreds or
> thousands of devices and systems in their product line, they need to be
> able to provide the correct license and a download of the exact version.
> This means machine readability too.
>
I've actually observed the opposite of that. Communities are abandoning
the "license in every file" model and just using a single LICENSE file
in the root of the project. The LICENSE file contains license
information; that is, it is not necessarily a single license, but may
have exception sections and so on.
> To quote from https://reuse.software/practices/2.0/ :
>
> Scroll to the section "2. Include a copyright notice and license in each
> file"...
>
> "Source code files are often reused across multiple projects, taken from
> their origin and repurposed, or otherwise end up in repositories where
> they are separate from its origin. You should therefore ensure that all
> files in your project have a comment header that convey that file’s
> copyright and license information: Who are the copyright holders and
> under which license(s) do they release the file?
>
Continuing from above, the standardization of package-management
formats and tools has helped exactly with that: to avoid distribution
of single files and instead provide packages and modules. Copying
files around is bad practice and considered a hack. Nobody liked that
model and everyone is moving away from it; it is unstructured, it
becomes unmanageable, and it will cause problems.
> It is highly recommended that you keep the format of these headers
> consistent across your files. It is important, however, that you do not
> remove any information from headers in files of which you are not the
> sole author.
>
> You must convey the license information of your source code file in a
> standardised way, so that computers can interpret it. You can do this
> with an SPDX-License-Identifier tag followed by an SPDX expression
> defined by the SPDX specifications."
>
> (the text goes on for a while after this, to clarify the point but this
> is the basic gist of it)
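For reference, such a header in a Python file would look like this (the copyright holder and license below are placeholders, not a proposal for idpy):

    # Copyright (c) 2019 Jane Doe <jane@example.org>
    # SPDX-License-Identifier: Apache-2.0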
>
> There is a nice Python tool to check:
>
> https://github.com/fsfe/reuse-tool
>
> I hope this makes sense
>
Well, it does not make complete sense. We're talking about licensing a
project. A project is not just code; there are data files (html, xml,
yaml, json files), binary files (archives/zip, images, audio, video,
etc), text files (configs, ini-files, etc) - all "not-code". How do you
mark those files? Does the LICENSE file need a license header? The
json format does not define comments, so how do you add a header there?
If a binary file does not get a license header, why should a file with
code get one?
I would expect there to be a way to have the needed information
unified. If the files themselves cannot provide this information, it
has to be external; thus the LICENSE file. If someone is worried about
somebody else re-using single files that do not have license
information (a python file, a png image, etc), there is really nothing
you can do (the DRM industry has been trying to solve this for a long
time, and still your best bet is "social DRM").
Since we're developing open source with a permissive license, even
if someone does that, should we be happy that someone is actually
using what we built, or sad that the files they copied did not have a
license header? And if they include the license information of that
copied file in their project's LICENSE file, is the problem solved?
Having pointed out these contradictions, I think that the "license
in every file" model is a step backwards. It introduces overhead and
does not really solve the problem, while at the same time it enables
a culture of bad practice (copying files around).
Cheers,
--
Ivan c00kiemon5ter Kanakarakis >:3
*Idpy meeting 3 September 2024*
Attendees: Johan W, Johan L, Shayna, Ivan, Hannah S
0 - Agenda bash
1 - Project review
a. General - Ivan's plan is to merge things that don't break anyone's
flow.
b. OIDC libraries - https://github.com/IdentityPython (idpy-oidc,
JWTConnect-Python-CryptoJWT, etc)
- pyop - there are some changes that can go ahead and won't block anything,
and then there will be a new release -
https://github.com/IdentityPython/pyop/pull/55
- plan is still to move away from pyop, however
- more patches coming up for idpy-oidc - internal repos so no PRs.
- configuration change needed to handle redirect URIs better - redirect
URLs with special characters like spaces work with some flows but not
for others.
- resource indicators - there are specific use cases as to how they are
to be treated and that has been encoded into tests; the code changes
come later. Separating what happens when resource indicators are in
place with regard to token exchange - the two specs reference each
other but also conflict in some ways (see the sketch after this list).
- introducing new concepts around audience policies
- mechanism that allows you to state an audience
- what requirements you have for the audience: allow multiple
values, one value, etc.
- This has nothing to do with resource indicators, where you signal
which value or values should be set as the audience.
- There are also some questions as to how things work and when
resolution takes place based on different layers - you could request
resource X and this means the audience will get service 1; the
identifiers can be different.
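To make the overlap concrete, here is a minimal sketch of a token exchange request (RFC 8693) that also carries a resource indicator (RFC 8707); the endpoint, tokens, and values are made up for illustration:

    import requests

    TOKEN_ENDPOINT = "https://op.example.org/token"  # hypothetical OP

    response = requests.post(
        TOKEN_ENDPOINT,
        data={
            "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
            "subject_token": "eyJ...",  # the token being exchanged
            "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
            # RFC 8693: logical name of the target service for the new token
            "audience": "service-1",
            # RFC 8707: URI of the protected resource the client wants to call;
            # both parameters feed into the resulting audience, which is where
            # the two specs overlap and can conflict
            "resource": "https://api.example.org/resource-x",
        },
        auth=("client-id", "client-secret"),
    )
    print(response.json())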
c. Satosa - https://github.com/IdentityPython/SATOSA
- Anything behind a feature flag can probably be merged, such as the
logout capabilities that Hannah S and Ali have been working on.
- logout PRs that can be merged -
- https://github.com/IdentityPython/SATOSA/pull/444
- https://github.com/IdentityPython/SATOSA/pull/431
- backend/frontend connections - need some discussion - complex
- https://github.com/IdentityPython/SATOSA/pull/449
- https://github.com/IdentityPython/SATOSA/pull/450
- These will be easy to pull in:
- Apache configuration:
https://github.com/IdentityPython/SATOSA/pull/462
- TU Wien SP configuration example:
https://github.com/IdentityPython/SATOSA/pull/469
- EntraID backend:
https://github.com/IdentityPython/SATOSA/pull/461
- documentation cleanup:
https://github.com/IdentityPython/SATOSA/pull/458
- xmlsec breaking:
https://github.com/IdentityPython/SATOSA/pull/452
- dev processes - pre-commit and flake:
https://github.com/IdentityPython/SATOSA/pull/454
- a bit harder:
- types - needs thought but can probably move forward -
https://github.com/IdentityPython/SATOSA/pull/435
- removing pyoidc, separating dependencies between SATOSA and
pysaml2 - this is a breaking change; people using SATOSA will now have
to install pysaml2 separately -
https://github.com/IdentityPython/SATOSA/pull/442
- more involved:
- Kristof - base paths - need to make sure we're not breaking
anything. Paths that were there before should still just work.
https://github.com/IdentityPython/SATOSA/pull/451
- adding new member services - exposing information - needs to be
done a different way.
https://github.com/IdentityPython/SATOSA/pull/448
- LDAP plugins - add tests - not pressing, on hold
- backend and frontend names are unique - this PR should go in but not
in the suggested format.
d. pySAML2 - https://github.com/IdentityPython/pysaml2
- To be merged:
- xmlenc: https://github.com/IdentityPython/pysaml2/pull/964
- EC types: https://github.com/IdentityPython/pysaml2/pull/897
- MDQ: https://github.com/IdentityPython/pysaml2/pull/959
- domain validation:
https://github.com/IdentityPython/pysaml2/pull/951 - needs a few
changes, then will be easy to pull in
- UTC https://github.com/IdentityPython/pysaml2/pull/939 - can go
in with a little bit of checking
- Windows support - these will probably be closed and done differently -
maybe using signals from garbage collector cleanup would be better as a
workaround? Really needs to be addressed by Python itself.
- https://github.com/IdentityPython/pysaml2/pull/933
- https://github.com/IdentityPython/pysaml2/pull/931
- https://github.com/IdentityPython/pysaml2/pull/665
- important: encryption algos:
https://github.com/IdentityPython/pysaml2/pull/924 - this one needs
to be checked - cannot just be merged
- dev processes - these will probably be merged - run tests when a
merge request is opened; release packages when a merge request is
merged; etc.
- https://github.com/IdentityPython/pysaml2/pull/882
- https://github.com/IdentityPython/pysaml2/pull/816
- lxml: https://github.com/IdentityPython/pysaml2/pull/940 - not
complete - it is a draft. It is a basis for using lxml everywhere in the
project. The lxml parser is QName-aware - it knows when an XML attribute
value contains a namespace prefix or a type. The default Python parser
does not do anything with namespaces, so when you try to do validation,
the namespace is missing because Python has optimized it away (removed
it). There are certain use cases where this is a problem (see the
example below). Ivan may also talk to a person who has an XML validator
which has a way of using the default Python parser but is still able to
check for those edge cases.
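A minimal standalone illustration of that difference (the SAML snippet is made up; this is not pysaml2 code):

    import xml.etree.ElementTree as ET
    from lxml import etree

    # The xs prefix is only referenced inside an attribute *value*,
    # as in a typical saml:AttributeValue with xsi:type="xs:string".
    XML = (
        '<saml:AttributeValue'
        ' xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"'
        ' xmlns:xs="http://www.w3.org/2001/XMLSchema"'
        ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
        ' xsi:type="xs:string">member</saml:AttributeValue>'
    )

    # stdlib round-trip: xmlns:xs is dropped because the parser only tracks
    # namespaces used in element/attribute names, so "xs:string" can no
    # longer be resolved when validating
    print(ET.tostring(ET.fromstring(XML)).decode())

    # lxml round-trip: all three declarations survive
    print(etree.tostring(etree.fromstring(XML)).decode())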
e. Any other project (pyFF, djangosaml2, pyMDOC-CBOR, etc)
- question on Slack concerning pyFF from Hannah at CERN. Ivan will try
to get to it today.
- PyFF - Ivan needs to look at this issue:
https://github.com/IdentityPython/pyFF/issues/264
- pyFF - MDQ - Ivan would like to have a configuration that says the
output should go into either the file system (what happens now), into
S3, or into a database (in which case you don't need a Discovery Service;
everything can be an API call). In the database case things can then be
quickly sorted and indexed the way you like. The problem is there is no
mapping between XML and a table. Need to think about how to do indexing
without the schemas, and so on. This will unlock capabilities that we
don't have right now and also simplify what we do with the discovery
service (a rough sketch of such pluggable output follows at the end of
this section).
- Also need to change the way we parse XML into memory - can do this
within the entities descriptor. This shouldn't be hard. This would make
it so we don't need a big machine or lots of resources to parse a large
aggregate every 6 hours. pyFF could possibly be put into a lambda.
- Or using S3 could make this a serverless process.
- SATOSA itself can also be simplified, but the whole configuration
would need to change. They have looked at moving toward a framework like
Django - not sure if this would be done as SATOSA or a SATOSA version 2.
A new approach in parallel with what we have now - does that make sense
time-wise and maintenance-wise? How to do this without breaking what is
there now? Need to experiment with Django. The async parts of Django
would make some parts of SATOSA easier. Background things like
statistics that don't need to interact with the actual flow but need to
be there - perhaps an API call to Elasticsearch to record that a new
flow happened. OpenTelemetry - asynchronous calls to the logger -
tracing - do these in a way that doesn't affect the timing of the flow
itself.
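A rough sketch of the pluggable MDQ output idea mentioned above (none of this exists in pyFF today; class names, the bucket, and the key scheme are hypothetical):

    import hashlib
    from pathlib import Path
    from typing import Protocol

    import boto3  # only needed for the S3 variant

    class MetadataStore(Protocol):
        def write(self, entity_id: str, xml: bytes) -> None: ...

    class FileSystemStore:
        """Roughly what happens now: one file per entity on disk."""
        def __init__(self, base_dir: str) -> None:
            self.base = Path(base_dir)
            self.base.mkdir(parents=True, exist_ok=True)

        def write(self, entity_id: str, xml: bytes) -> None:
            # MDQ-style name: "{sha1}" + hex digest of the entityID
            name = "{sha1}" + hashlib.sha1(entity_id.encode()).hexdigest()
            (self.base / f"{name}.xml").write_bytes(xml)

    class S3Store:
        """Hypothetical serverless-friendly variant."""
        def __init__(self, bucket: str) -> None:
            self.bucket = bucket
            self.s3 = boto3.client("s3")

        def write(self, entity_id: str, xml: bytes) -> None:
            self.s3.put_object(Bucket=self.bucket, Key=f"entities/{entity_id}", Body=xml)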
2 - AOB
- Ivan is doing a lot of work on EOSC with the AI integration.
- Next meeting - 17 September. Shayna will not be available but will
send out the meeting reminder. Ivan will take notes and send them to Shayna
to distribute.