Datasets Management

The datasets table

When the administrator accesses the Administration dashboard, they are redirected directly to the Datasets manager panel. This panel contains a table listing the datasets hosted in ShowVoc. Here follows a description of the table headers:

Datasets management

From the Datasets manager panel, the administrator can create and manage datasets.

Create a dataset

To create a new dataset, click the + button in the top right corner of the panel. A window like the one in the figure below is shown:

Here you have to fill in the following fields (in order of appearance):

Actions on dataset

In the last column of the table, a dropdown menu provides the actions for managing a dataset.

The available actions are:

Loading data

To load data into a dataset, the user needs to fill in the following form, which is shown once the Load data entry has been clicked.

Dataset labels

A dataset is identified by its name, which cannot be modified once the dataset has been created. This limitation exists because the project name is actually an identifier that is widely referenced in the data structure of SemanticTurkey. However, it can be convenient for the user to see the dataset under a more user-friendly label. To overcome this limitation, ShowVoc allows authorized users to assign multilingual labels to the datasets.
Clicking the Edit labels entry in the dataset menu opens the following editor.

Here it is possible to add, edit and delete labels in several languages (only one label per language). Then, according to the active UI language and the rendering status, which can be set in the Dataset Manager dashboard or in the Datasets page, the datasets are shown throughout the application by their associated label, as in the following examples.
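The label lookup described above can be pictured with a minimal Python sketch. It is not ShowVoc code: the function name, the dataset name and the labels are hypothetical, and the fallback to the dataset name when no label exists for the active language is an assumption.

```python
from typing import Dict


def display_name(dataset_name: str, labels: Dict[str, str],
                 ui_lang: str, rendering_on: bool) -> str:
    """Return the text shown for a dataset (illustrative sketch only)."""
    # One label per language, so a dict keyed by language code is enough.
    if rendering_on and ui_lang in labels:
        return labels[ui_lang]
    # Assumption: without rendering, or without a label in the UI language,
    # the immutable dataset name is shown.
    return dataset_name


# Hypothetical dataset with labels in two languages.
labels = {"en": "Agricultural Thesaurus", "it": "Tesauro agricolo"}
print(display_name("agrithes", labels, "it", rendering_on=True))
print(display_name("agrithes", labels, "it", rendering_on=False))
```

With rendering enabled and the UI in Italian the sketch returns the Italian label; with rendering disabled it returns the dataset name.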

Dataset settings

Hosted Public datasets can be accessed by visitor users through the Datasets page. From the Dataset Settings dialog, the administrator can modify the default settings for visitors.

The Languages panel sets the default language(s) considered for rendering the resources in the Data View. The defined languages can be sorted by name, code or rendering order ("position" column). The "active" column determines which languages have been selected to be shown. No selection is equivalent to "show all languages". This is only a default dataset setting: visitor users can customize it as described in the Data Structure View section.
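The "no selection means all languages" rule can be sketched as follows. This is only an illustration in Python: the `LanguageSetting` class and its field names are hypothetical and simply mirror the "active" and "position" columns of the panel.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LanguageSetting:
    code: str       # language code, e.g. "en"
    active: bool    # "active" column
    position: int   # "position" column (rendering order)


def rendering_languages(settings: List[LanguageSetting]) -> List[str]:
    """Return language codes used for rendering, in rendering order."""
    selected = [s for s in settings if s.active]
    if not selected:
        # No selection is equivalent to "show all languages".
        selected = list(settings)
    return [s.code for s in sorted(selected, key=lambda s: s.position)]
```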

Moving to the ResourceView tab, it is possible to customize several aspects of the ResourceView.

The structure of the ResourceView is guided by a template. Every type of resource defined in ShowVoc is mapped to a pre-defined template. For example, the default template for concepts has the following sections (or partitions): Types, Top Concept of, Schemes, Broaders, Lexicalizations, Notes and Other properties. From the Template panel it is possible to edit these mappings between resource types and sections. After a resource type is selected, the two Sections lists show, respectively, the sections assigned to the related template (in the left list) and the remaining available ones (in the right list). A partition can be added to, or removed from, the visible ones by selecting it and using the two arrow buttons between the lists. An active section can also be moved to a different position in the list by using the buttons on the panel header.

The composition of a single template section is based on a statements consumer which, as the name suggests, "consumes" those statements of a resource that involve specific managed properties. For example, the Types section is bound to a consumer which manages rdf:type and its sub-properties. So, defining a new section means, under the hood, defining a new consumer. To do that, use the Custom Sections panel, which is split into two sub-panels: Sections and Managed properties. The first one allows creating new custom sections and renaming or deleting the created ones. In the second one, the user can define the list of managed properties, namely those properties that will be processed by the consumer and whose values will be collected under the section.
Note that a consumer also processes the sub-properties of its managed properties. Moreover, each property is processed only once, by the first consumer that manages it. This means that a custom section that consumes the rdf:type property has no effect if it is placed after the Types section (which already consumes the same property). Conversely, if it is placed before Types, the custom consumer processes rdf:type, the related section shows its values, and Types ends up as an empty section.
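The "first consumer wins" behavior can be illustrated with a small Python sketch. It is not the ShowVoc implementation: the function names, the `ex:kind` property and the "My types" section are hypothetical, each property is assumed to have at most one parent property, and statements matched by no section are simply ignored here.

```python
from typing import Dict, List, Optional, Set, Tuple

Statement = Tuple[str, str]        # (property, value)
Section = Tuple[str, Set[str]]     # (section name, managed properties)


def manages(prop: str, managed: Set[str], parent_of: Dict[str, str]) -> bool:
    """True if prop, or one of its super-properties, is a managed property."""
    current: Optional[str] = prop
    seen: Set[str] = set()
    while current is not None and current not in seen:
        if current in managed:
            return True
        seen.add(current)
        current = parent_of.get(current)  # simplification: single parent
    return False


def assign_to_sections(statements: List[Statement], sections: List[Section],
                       parent_of: Dict[str, str]) -> Dict[str, List[str]]:
    """Give each statement to the first section (in order) that manages it."""
    result: Dict[str, List[str]] = {name: [] for name, _ in sections}
    for prop, value in statements:
        for name, managed in sections:
            if manages(prop, managed, parent_of):
                result[name].append(value)
                break  # consumed once: later sections never see it
    return result


statements = [("rdf:type", "skos:Concept"), ("ex:kind", "ex:Special")]
parent_of = {"ex:kind": "rdf:type"}  # hypothetical sub-property of rdf:type
types = ("Types", {"rdf:type"})
custom = ("My types", {"rdf:type"})  # hypothetical custom section

print(assign_to_sections(statements, [types, custom], parent_of))
print(assign_to_sections(statements, [custom, types], parent_of))
```

In the first call the custom section stays empty because Types comes first; in the second call the custom section collects both values and Types is left empty.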

An additional panel under the ResourceView tab is Value-filter languages. The behavior of this component is explained in the ResourceView section of the Data View page.

Finally, the third tab (Other) collects further settings for the dataset. The content of this tab mostly depends on the dataset model. As shown in the following image, a Concept tree settings panel is available because the settings being edited belong to a SKOS dataset. Other panels may appear depending on the dataset model: for example, for OntoLex datasets an editor for Lexical Entry list settings is also shown, and for OWL datasets an editor for Instance list settings is shown.

These settings define the default mode for visualizing the data in the structures (trees or lists). The mode can be forced or limited, which affects the experience of visitor users. For instance, the administrator may prevent visitors from switching from one visualization mode to another (through the Allow visualization change option), forcing them to use the mode chosen here.

More details about these settings are available in this page.

Dataset managers

The "Edit Dataset Managers" entry in the actions menu of a dataset opens the following dialog.

Within this editor it is possible to manage the project manager users associated with the specific dataset. Administrators can add or remove project managers simply by moving users between the Managers and Registered users columns. Administrator users cannot be removed from the list of managers.

Dataset status

As seen previously, the last column of the table shows the current status of the datasets. What exactly does the status mean? To answer this question, it is necessary to briefly explain how a dataset is created in ShowVoc.

A dataset in ShowVoc can be created in two ways: explicitly by the administrator, as seen before, or following the submission of a contribution request for a stable resource. Once the contribution request is evaluated and accepted by the administrator, a new dataset is created. This newly created, empty dataset is in the Pristine status. The contributor can then proceed to load the data into the dataset. Once the data is successfully loaded, the dataset is no longer empty and moves from the Pristine status to the Staging one. A dataset in the Staging status is not visible to a "simple" user (a visitor): it does not appear in the datasets list of the Datasets page, and the resources it contains are not returned by the Search feature. In short, a Staging dataset can be described as a private dataset, only visible to the administrator. To make the dataset and its content visible, the administrator needs to switch the status to Public.
A dataset created by the administrator through the Create dataset form is instead in the Staging status even if it is empty. To sum up, a dataset goes through the Pristine status only when it is created by means of a contribution and is waiting for data to be loaded.

A Public dataset can later be moved back to the Staging status in order to restrict its visibility, and in the same way a Staging dataset can be made Public again. The only status that cannot be set or changed manually is Pristine, which is automatically assigned by the system when the dataset is created and is replaced by Staging once the data is loaded by the contributor.
The intermediate Staging status, between Pristine and Public, has been introduced not just to handle dataset visibility, but also to allow the administrator to inspect the data loaded by a contributor before making it public and, if necessary, to prevent incorrect data from being published on ShowVoc.
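The status lifecycle described above can be summarized as a small state machine. The following Python sketch is purely illustrative: the `Dataset` class and its method names are hypothetical and only encode the rules stated in this section.

```python
from enum import Enum


class Status(Enum):
    PRISTINE = "Pristine"
    STAGING = "Staging"
    PUBLIC = "Public"


class Dataset:
    def __init__(self, name: str, from_contribution: bool) -> None:
        self.name = name
        # Contribution-created datasets start empty in Pristine;
        # administrator-created ones start in Staging, even if empty.
        self.status = Status.PRISTINE if from_contribution else Status.STAGING

    def load_data(self) -> None:
        # Loading data is the only way out of Pristine.
        if self.status is Status.PRISTINE:
            self.status = Status.STAGING

    def set_status(self, new_status: Status) -> None:
        """Manual switch by the administrator (Staging <-> Public only)."""
        if Status.PRISTINE in (self.status, new_status):
            raise ValueError("Pristine cannot be set or changed manually")
        self.status = new_status

    def visible_to_visitors(self) -> bool:
        return self.status is Status.PUBLIC


contributed = Dataset("contributed-dataset", from_contribution=True)
contributed.load_data()                 # Pristine -> Staging
contributed.set_status(Status.PUBLIC)   # administrator publishes it
print(contributed.visible_to_visitors())
```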

When a dataset switches between the Staging and Public statuses, the administrator can also decide whether to create or delete the index used by the search feature.

Dataset facets

When a large number of datasets is loaded in ShowVoc, the default view, which consists of a flat list of datasets, may be poorly organized and difficult to read. This is where the facet-based view comes in handy. This view allows you to group datasets according to a chosen facet in order to make the view cleaner and better organized.

Facets are aspects and attributes that characterize a dataset. ShowVoc provides four pre-defined facets: Model and Lexicalization, which are unmodifiable attributes chosen during the creation of the dataset, plus Category and Organization.

In addition to such pre-defined facets, ShowVoc allows the administrator to define custom facets.

Clicking the "cog" button in the top right corner of the panel and selecting the Custom project facets schema settings entry opens the following dialog.

A new facet can be added through the "plus" button. The editor then shows several fields that can be filled in to define the new facet:

Clicking the Edit facets entry in the dataset context menu (under the Actions column) opens the following dialog, which allows the editing of the dataset facets. As you can see, it lists the built-in facets Category and Organization plus the custom one just defined.

Now that we know what facets are, how to define custom ones and how to set them in a dataset, let's see how they can be used to organize the dataset view. As we have already seen, the "cog" button, placed on the top bar of the dataset list, opens a menu. By selecting Dataset view settings, the user can customize the visualization of the dataset view.

Here it is possible to choose between two visualization modes: List, namely the "classic" flat list of datasets, and Facet based. When the latter is selected, a Facet selector appears, which allows the selection of the facet by which the datasets are grouped.

Here is an example of facet-based visualization based on Category. The categories shown here have been assigned arbitrarily by the administrator. The last one, "Unclassified", is a dedicated group that collects those datasets for which the category (or, in general, the chosen facet) has not been specified.
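The grouping logic behind the facet-based view can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout and the dataset names are hypothetical, while the "Unclassified" group name is the one mentioned above.

```python
from collections import defaultdict
from typing import Dict, List

UNCLASSIFIED = "Unclassified"


def group_by_facet(datasets: List[dict], facet: str) -> Dict[str, List[str]]:
    """Group dataset names by the value of the chosen facet."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for ds in datasets:
        # A missing or empty facet value puts the dataset in "Unclassified".
        value = ds.get("facets", {}).get(facet) or UNCLASSIFIED
        groups[value].append(ds["name"])
    return dict(groups)


# Hypothetical datasets with arbitrarily assigned categories.
datasets = [
    {"name": "ds-a", "facets": {"Category": "Agriculture"}},
    {"name": "ds-b", "facets": {"Category": "Law"}},
    {"name": "ds-c", "facets": {}},
    {"name": "ds-d", "facets": {"Category": "Agriculture"}},
]
print(group_by_facet(datasets, "Category"))
```

Choosing a different facet in the call regroups the same datasets without changing the underlying list, which mirrors how the Facet selector only affects the view.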

The visualization preferences can also be changed through the Datasets view settings dialog in the Datasets page.

Super User capabilities

The Datasets manager table is also accessible to the Super User. From this page, the Super User is authorized to create new datasets and to perform most of the actions described in this page, with some exceptions. In particular: