Releases: clarin-eric/standards

v2.9.0

06 Jun 12:04

What's Changed

Centres

This version focuses on enhancing centre-oriented calculations and the display of information:

  • centre statistics have been expanded; additional counts are provided
  • more navigable links in centre descriptions (domains used by the centre's recommendations are now clickable)
  • more frills in centre descriptions possible (more headings added, #369, exemplified by CLARIN:EL)
  • it is no longer possible to mix centre status values across RIs (no more centres that are both B- and Operations, for example); while all the values appear in the drop-down list when editing, mixing "wrong" values results in a validation error (#396 and related);
  • centres can have more than one name (e.g. native + English, or RI-bound) [implemented in the schema only, for now; see #245 for the long, boring story and lingering TODOs].
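
The status rule above is enforced in the SIS by its schemas; purely as an illustration, the same check can be sketched in Python. The RI names and the assignment of status values to RIs below are placeholder assumptions, not the actual SIS vocabulary, and `validate_statuses` is a hypothetical helper.

```python
# Hypothetical status vocabularies per research infrastructure (RI).
# The real lists live in the SIS schemas; these values are placeholders.
STATUSES_BY_RI = {
    "RI-A": {"B-centre", "C-centre"},
    "RI-B": {"Operations", "Data"},
}


def validate_statuses(statuses):
    """Return a list of error messages; an empty list means no mixing was found.

    `statuses` is a list of (ri, status) pairs, as one might read them
    from a centre description.
    """
    errors = []
    for ri, status in statuses:
        allowed = STATUSES_BY_RI.get(ri)
        if allowed is None:
            errors.append(f"unknown RI: {ri}")
        elif status not in allowed:
            errors.append(f"status {status!r} does not belong to {ri}")
    return errors
```

Every value can still be offered in an editing drop-down; the check only rejects pairs that cross RI boundaries.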

New centre recommendations

This release coincides with several very welcome contributions of centre recommendations, submitted by several colleagues listed below, in the "New contributors" section -- many thanks to them!

Schemas

  • Assisted editing of the XML containing centre-related (and format-related) data has also become easier through tightened schemas (both XML Schema and Schematron).
  • Note that these goodies can only be used if the relevant XML documents (centre recommendations or file format descriptions) are edited in the context of their associated document grammars (recommendation.xsd and format.xsd, respectively).

Format descriptions

  • This version comes with many new format descriptions, a large portion of which has been donated following up on, or preparing for, contributions of recommendations by the colleagues listed below.
  • More navigable elements on format info pages: the data domain names are now clickable, and the message clearly states that they are derived from recommendations (there is no direct dependence of a data format on a data domain, in the SIS -- that's by design and a part of the whole trick)
  • Missing formats have received some new statistics in the sanity checker: the number of recommendations (NOT centres) that mention each of them is shown in brackets, and the list can be sorted by that number (#372); the number of centres (NOT recommendations) that mention a given missing format is also provided, in another section of the sanity checker.
  • In the format list, the number of centres mentioning the given format is provided -- that can be useful for many reasons (#379)
  • A new closed vocabulary set for keywords in format descriptions, derived from GDFR recommendations (also used by the Library of Congress, see the doc/ directory for the theoretical basis, and issue #361 for the context); @type="gfdr" switches the <keyword> element from free text into a selection from a closed vocabulary (schema-assisted).
  • The "genetic" format families graph, our experimental tool for alternative navigation across formats, is tighter now (while still being more or less in alpha)
  • Explicit typing of "hub formats" has been introduced in the schema and in the format descriptions, although not yet implemented on the visualisation side. In the near future, uncommented and thus close-to-meaningless recommendations for format families ("hubs") such as "XML", "TEI" or "CoNLL" are going to be highlighted, and, optionally, these formats will be excluded from the statistics. See #407 for the lingering TODOs.
  • the <extId> element can now have an optional @label attribute; implemented in the schema and some descriptions, not yet visualised, see #420 for progress.
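
The two counts for missing formats (recommendations vs. centres) differ because one centre may mention the same format in several recommendations. A minimal Python sketch of that distinction follows; the data shape, a list of (centre, format ID) pairs with one pair per recommendation, and the function name are assumptions made for illustration, not the SIS implementation.

```python
from collections import defaultdict


def missing_format_stats(recommendations, described_formats):
    """Map each missing format ID to (recommendation count, centre count).

    `recommendations` holds one (centre, format_id) pair per recommendation;
    `described_formats` is the set of format IDs that have a description.
    """
    rec_count = defaultdict(int)
    centres = defaultdict(set)
    for centre, format_id in recommendations:
        if format_id not in described_formats:
            rec_count[format_id] += 1          # every mention counts
            centres[format_id].add(centre)     # each centre counts once
    return {f: (rec_count[f], len(centres[f])) for f in rec_count}
```

Sorting the result by the first element of each tuple would give the ordering by number of recommendations.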

Data domains

  • Additional domains for widely understood machine learning applications have been added (kudos to @raspberryjoy for sharing her expertise on that).
  • Domain management has improved: now, the "Uncategorised" category is truly a grouping of data domains without a metadomain (which is not necessarily a bad state, because metadomains, as convenience groupings, make sense only where the individual domains have something tangible in common).
  • Metadomains now have short descriptions.
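
The "Uncategorised" grouping described above amounts to collecting every domain that has no metadomain under one label. A small Python sketch, assuming domains arrive as (name, metadomain) pairs with `None` for a missing metadomain; the domain names in the usage note are invented.

```python
from collections import defaultdict


def group_by_metadomain(domains):
    """Group domain names by metadomain; domains without one go to 'Uncategorised'."""
    groups = defaultdict(list)
    for name, metadomain in domains:
        groups[metadomain or "Uncategorised"].append(name)
    return dict(groups)
```

For example, `group_by_metadomain([("Domain A", "Meta 1"), ("Domain B", None)])` puts "Domain B" under "Uncategorised".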

Updated documentation

  • The wiki pages have been re-read and redone where necessary. In particular, the page on "Detailed syntax of information elements in the SIS" has been extended.
  • The /docs directory has been reworked; among other things, it now holds a vault with a selection of documents that have been, and will be, useful both for assigning keywords to individual format descriptions and for more precise retrieval of the requested information.

Lil' fings

  • Fixes of typos (broadly understood and sometimes bordering on bugs),
  • bug fixes,
  • information restructuring (e.g., modified left-side menu),
  • links to GitHub slightly redone (that should continue, also because GitHub has remodelled its issues, a few times),
  • modified prose on many pages, including quite a few format descriptions.
  • Internally, some potentially confusing functions now have full signatures, to make them easier to debug.
  • Finally, we have a new favicon! (Art and implementation by Eliza.)

Primary coding credits: @margaretha and @bansp .

New Contributors

Full Changelog: v2.8.0...v2.9.0

Standards Information System, version 2.8.0

02 Dec 02:03

Highlights of version 2.8.0

Centres

  • improvements in the presentation and processing of information about centres
    • RI switch improved,
    • difference between pages for centres that curate vs. those that don't
      • explicit info about the maintainers
      • error report buttons placed differently
    • marking depositing and curated centres explicitly in the centre list
    • better logic for KPI calculations in CLARIN (still room for improvement, see #320)
    • support for closely related centres (see CLARIN-CH; see #300 for remaining tasks)
  • New centres now curate information in the SIS (many thanks to the Technical Centres Committee for keeping the flame; this is a jump from 0 to 10 since the beginning of this year)

Inter-RI

  • all CLARIN Text+ centres are in now, with links to the Text+ registry
    • linking from Text+ to the SIS is a pending issue
    • (non-CLARIN Text+ centres are still to-do, see #333)
  • DARIAH centres link to DARIAH registry

Formats

  • Several format descriptions added, many improved (keywords, cross-references, external refs)
  • New data domain: Packaging

Statistics

  • New statistics, some pages rearranged between statistics proper and sanity checking

Documentation

  • extended the About page and updated/extended several others,
  • more cross-referencing across the pages,
  • initial steps towards reviving the Watchtower section

Engine

  • Upgrade to eXist 6.3.0 (and a corresponding update to the wiki docs)

+ numerous under-the-hood changes and fixes (a.o. logic, export, navigation, schemas, accessibility, favicon; cleanup)

Coding credits: @margaretha , @bansp

New Contributors (thank you and welcome to the repo!)

Full Changelog: v2.7.0...v2.8.0

Standards Information System, version 2.7.0

24 May 11:07

This release corresponds, with minor enhancements, to the paper "Standards Information System for CLARIN centres and beyond" by Piotr Bański and Eliza Margaretha Illig, published in the proceedings volume of the CLARIN Annual Conference 2023.

Highlights

  • Support for research infrastructures other than, but related to, CLARIN (placeholder for DARIAH, pilot coverage for Text+)
  • Enhanced centre descriptions (maintainers, warning when no maintainers set, list of domains used in the recommendations, sensitivity to the RI switch)
  • Fixed centre modelling (no more listing centres separately from their recommendations; the centre list is now a function of what the recommendations/ directory contains); one consequence is greater control by the centre over all of its data
  • Additions to the formats/ directory (new docs, updates / extensions of the existing ones)
  • Small but essential: functionality enhancements and bug fixes; schema enhancements, including annotations and Schematron
  • Visual cosmetics, updates in the general information content
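
The centre-modelling change above means the centre list is derived from the recommendation files rather than maintained separately. A minimal Python sketch of that idea follows; it assumes every file is named `<centre>-recommendation.xml`, a pattern generalised from the single file name `FIN-CLARIN-recommendation.xml` mentioned in these notes, and the helper name is hypothetical.

```python
SUFFIX = "-recommendation.xml"  # assumed naming pattern, see the note above


def centres_from_filenames(filenames):
    """Derive a sorted list of centre IDs from recommendation file names.

    Files that do not follow the assumed naming pattern are ignored.
    """
    return sorted(
        {name[: -len(SUFFIX)] for name in filenames if name.endswith(SUFFIX)}
    )
```

Feeding it the output of `os.listdir("recommendations")` would list exactly those centres that have a recommendation file, with no second list to keep in sync.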

... the above were authored by @margaretha and @bansp (credited also as 'piotr')

Centre recommendation updates

  • Adding SBX recommendation formats by @ljo in #243
  • KP-7936 Update FIN-CLARIN-recommendation.xml (#1) by @mmatthiesencsc in #250
  • update SAW recommendation by @redfarg in #264
  • updates of IDS recommendations and two maintainers (for textual and audiovisual formats, separately); kudos to Marc Kupietz and Harald Lüngen for their work on the textual part, and to Mark-Christoph Müller for the adjustments in the audiovisual realm.

Full Changelog: v2.6.0...v2.7.0

Standards Information System, version 2.6.0

20 Sep 13:00

The highlights of this release are:

  • support for research infrastructures other than CLARIN (heavy stress on Text+, experimentally also DARIAH)
  • enabling of a different way to traverse formats, along their hierarchical relationships (this is referred to as format family tree and is only linked from the "Popular Formats" menu item)

Other worthwhile mentions:

  • the list on the Data Deposition Formats page can now be both filtered and searched
  • the centre recommendation page supports curation info (red warning if no maintainer set, otherwise the maintainer is mentioned, e.g. by a GitHub handle)

As a consequence of work on extending the RI coverage and of the first inputhon (at IDS Mannheim), the format of the descriptions of centres and of particular recommendations became more expressive (formats can be referenced, and language tags can be used). Documentation was extended and updated (though that is not part of this code release; the wiki documentation lives in a separate repository).

Apart from that, bugs have been squashed and format descriptions have been extended.

Full Changelog: v2.5.0...v2.6.0

v2.5.0

15 Apr 18:39

What's Changed

  • usability fixes: better search, better placement of information, better navigation (a.o., #196 #153 #125 #187)
  • enhanced statistics: CLARIN KPI derived automatically from the data in the SIS (#180)
  • further steps taken towards assisting centres in submitting information (schemas, documentation, preparing for "inputhon" as a once-only event for a centre to be done with its data)
  • adjust recommendation labels (see #33)
  • more documentation: the "about / FAQ" page, more content on several other pages (a.o. #171 #205)
  • more articulate steps towards extending the SIS's coverage beyond CLARIN (Text+ as a very promising direction) and API adjustments (a.o. #166 #206 #207)
  • new banner installed, with thanks to Elisa Gorgaini (#175)
  • format families graph secretly linked and steps taken towards gradually making it more usable as an alternative format-browsing tool (expected around 2.6.0)
  • DB update to eXist 6.2.0 (#162)
  • bug fixes

See also the corresponding milestone: https://github.com/clarin-eric/standards/milestone/10

Full Changelog: v2.4.0...v2.5.0

Standards Information System, version 2.4.0

10 Oct 22:14
4c24c2e

This is a maintenance release that wraps up the fixes done over the summer of 2022, before the CLARIN Annual Conference.
One new feature has been introduced: a Schematron constraint preventing the duplication of format IDs -- in order to facilitate the creation of new format descriptions.
See also the corresponding milestone (though bear in mind that some issues have been transferred to v.2.5.0, due to them not having been fully researched).
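
The duplicate-ID constraint itself is written in Schematron. As an illustration only, the same check can be sketched in Python, assuming that each format description is an XML document whose root element carries the format ID in an `id` attribute; that layout and the function name are assumptions, not taken from the SIS schemas.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_format_ids(xml_documents):
    """Return the sorted format IDs that occur in more than one document.

    `xml_documents` is an iterable of XML strings, one per format description.
    Documents whose root element has no `id` attribute are skipped.
    """
    ids = [ET.fromstring(doc).get("id") for doc in xml_documents]
    counts = Counter(i for i in ids if i is not None)
    return sorted(i for i, n in counts.items() if n > 1)
```

An empty result means a new description can be added without clashing with an existing ID.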

Full Changelog: v2.3.0...v2.4.0

Standards Information System, version 2.3.0

30 May 15:21

What's Changed

  • More pronounced division between the format and the standards section, with development focusing on the former, per the Standards Committee's decision;
  • functionality for more pronounced statistics, including formats most "popular" in the given domain and laying the basis for computing the format-related KPI measurement;
  • better search options, functionality for the use of keywords (cloud now on the front page) and for implementing visualisation of formal format families (as opposed to functional domains);
  • consolidated sanity checks;
  • credits: @bansp, @margaretha ;
  • see the corresponding milestone for discussions of the particular issues.

Some of the choices leading to this version are discussed in "Standards in CLARIN" by Piotr Bański and Hanna Hedeland -- a chapter of the forthcoming anniversary CLARIN book.

Full Changelog: v2.2.0...v2.3.0

Standards Information System, version 2.2.0

09 Mar 10:55

Much of what has happened can be seen in the milestone for this release: https://github.com/clarin-eric/standards/milestone/7?closed=1 although bear in mind that some tickets have been promoted to the next milestone if they turned out to be complex or not ripe enough.

In short:

  • a lot of internal cleaning and beautification took place
  • the information is spread across new pages (e.g. each centre now has an information page, and we have taken the first steps towards possibly extending the SIS to research infrastructures other than CLARIN)
  • we have a working sanity checker (the guiding ticket is #115 )
  • we have a working API (https://clarin.ids-mannheim.de/standards/views/api.xq )
  • the job of centre representatives hopefully got easier -- we now provide templates (#42 ) and the schemas contain closed lists of options together with their descriptions

Full changelog: v2.1.0...v2.2.0

Standards Information System, version 2.1.0

07 Jan 12:56
fbada7c

Release 2.1.0 of the Standards Information System brings bug fixes and many enhancements over SIS 2.0.0 concerning:

  • upgrade from eXist 5.2 to eXist 5.3.1 (prompted by the log4j security vulnerability),
  • enhanced display of information on formats (including connections to external taxonomies)
  • enhanced management of information about centres,
  • the ease of contributing information by centre representatives.

The code release accompanies release 1.0 of CLARIN data format recommendations.

Full Changelog: v2.0.0...v2.1.0

Standards Information System, version 2.0.0

21 Nov 13:32

Our biographers are going to be devastated the day they learn about the original description of this release getting lost due to a silly error... Still, let's try to make them at least minimally happy:

Release 2.0.0 was the first major release of the SIS after a series of upgrades and with an extension of the data model to encompass centre recommendations of data-deposition formats. The release accompanied release 1.0 of said recommendations.

Full Changelog since July 2021: v2.0.0-beta...v2.0.0

Full Changelog since time immemorial: v1.0.0...v2.0.0