Datasets Management

The datasets table

When the administrator accesses the Administration dashboard, they are redirected directly to the Datasets manager panel. This panel simply contains a table listing the datasets hosted in ShowVoc. A description of the table headers follows:

Datasets management

From the Datasets manager panel, the administrator can create datasets and manage them.

Create a dataset

To create a new dataset, click the + button in the top right corner of the panel. A window like the one in the figure below will open:

Here you have to fill in (in order of appearance) the following fields:

Actions on dataset

In the last column of the table, a dropdown menu provides the actions for managing a dataset.

The available actions are:

Loading data

To load data into a dataset, the user needs to fill in the following form, which is shown once the Load data entry has been clicked.

Dataset labels

A dataset is identified by its name, which cannot be modified once the dataset has been created. This limitation exists because the project name is actually an identifier that is widely referenced in the data structures of SemanticTurkey. However, it can be convenient for the user to see the dataset under a more user-friendly label. To overcome this limitation, ShowVoc allows authorized users to assign multilingual labels to datasets.
Clicking the Edit labels entry in the dataset menu opens the following editor.

Here it is possible to add, edit and delete labels in several languages (only one label per language). Then, according to the active UI language and the rendering status, which can be set in the Dataset Manager dashboard or in the Datasets page, the datasets are shown throughout the application by their associated label, as in the following examples.
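The label selection described above can be sketched as follows. This is only an illustrative model, not ShowVoc code: the function name, its parameters and the fallback to the dataset name when no label exists for the active language are assumptions.

```python
from typing import Dict


def display_name(name: str, labels: Dict[str, str],
                 ui_lang: str, rendering_on: bool) -> str:
    """Pick the text used to show a dataset (hypothetical helper).

    `labels` maps a language code to the single label allowed for that language.
    Assumption: when rendering is off, or no label exists for the active
    UI language, the immutable dataset name is shown instead.
    """
    if rendering_on and ui_lang in labels:
        return labels[ui_lang]
    return name


# Hypothetical dataset name and labels, for illustration only.
labels = {"en": "My Thesaurus", "it": "Il mio thesaurus"}
print(display_name("my_thesaurus", labels, "it", rendering_on=True))
print(display_name("my_thesaurus", labels, "fr", rendering_on=True))
```

Keeping the labels in a dictionary keyed by language code mirrors the "only one label per language" constraint.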

Dataset settings

Hosted Public datasets can be accessed by visitors through the Datasets page. From the Datasets Settings dialog, the administrator can modify the default settings for visitors.

The content of this dialog partially depends on the dataset model. For any kind of dataset, the Rendering Languages settings panel is available. This panel sets the default language(s) that are considered for rendering the resources in the Data View. The defined languages can be sorted by name, code or rendering order ("position" column). The "active" column determines which languages have been selected to be shown. No selection is equivalent to "show all languages". As just stated, this is a default dataset setting: visitors can customize it as they prefer.
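The "no selection means all languages" rule can be illustrated with a minimal sketch. The class and function names are hypothetical, and ordering the result by the "position" column is an assumption about how the rendering order is applied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RenderingLanguage:
    code: str       # language code, e.g. "en"
    name: str       # human-readable name
    position: int   # rendering order ("position" column)
    active: bool    # whether the language is selected ("active" column)


def effective_languages(langs: List[RenderingLanguage]) -> List[str]:
    """Return the codes of the languages used by default for rendering.

    If no language is active, every defined language is considered,
    which corresponds to "show all languages".
    """
    selected = [lang for lang in langs if lang.active]
    if not selected:
        selected = list(langs)
    # Assumption: languages are applied in ascending "position" order.
    return [lang.code for lang in sorted(selected, key=lambda l: l.position)]


langs = [
    RenderingLanguage("it", "Italian", 2, False),
    RenderingLanguage("en", "English", 1, True),
]
print(effective_languages(langs))
```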

As shown in the previous screenshot, a further Concept tree settings panel is available. This depends on the dataset model: it is available only when editing the settings of SKOS datasets, while for OntoLex datasets a similar Lexical Entry list settings panel is present. Basically, both of these settings define the default mode for inspecting the data (in the concept or lexical entry view). The mode can be forced or limited, which affects the visitors' experience. For instance, the administrator may choose to deny visitors the possibility of switching from one visualization mode to another (through the Allow visualization change option), forcing them to use the choice made here.

More details about these settings can be found in this page.

Dataset status

As seen previously, the last column of the table shows the current status of the datasets. What exactly does the status mean? To answer this question, it is necessary to briefly explain how a dataset is created in ShowVoc.

A dataset in ShowVoc can be created in two ways: explicitly by the administrator, as seen before, or after the submission of a contribution request for a stable resource. Once the contribution request has been evaluated and accepted by the administrator, a new dataset is created. This newly created, empty dataset is in the Pristine status. The contributor can then proceed to load the data into the dataset. Once the data has been successfully loaded, the dataset is no longer empty and moves from the Pristine status to the Staging one. A dataset in the Staging status is not visible to a "simple" user, that is, a visitor: it does not appear in the datasets list of the Datasets page, and the resources it contains are not returned by the Search feature. In short, a Staging dataset can be described as a private dataset, visible only to the administrator. To make the dataset and its content visible, the administrator needs to switch the status to Public.
A dataset created by the administrator through the Create dataset form is instead in the Staging status even if it is empty. To sum up, a dataset goes through the Pristine status only when it is created by means of a contribution and is waiting for data to be loaded.

A Public dataset can later be moved back to the Staging status in order to restrict its visibility, and in the same way a Staging dataset can be made Public again. The only status that cannot be set or changed manually is Pristine, which is automatically assigned by the system when the dataset is created and is then replaced by Staging once the data has been loaded by the contributor.
The intermediate Staging stage, between Pristine and Public, has been introduced not just to handle dataset visibility, but also to allow the administrator to inspect the data that a contributor has loaded before making it public, and thus to prevent incorrect data from being published on ShowVoc.

When a dataset switches between the Staging and Public statuses, the administrator can also decide whether to create or delete the index used by the search feature.
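The status lifecycle described in this section can be summarized as a small state machine. The following is a minimal sketch of those rules, not the actual ShowVoc implementation: the class, method and parameter names are hypothetical, and the search index handling is left out.

```python
from enum import Enum


class Status(Enum):
    PRISTINE = "Pristine"
    STAGING = "Staging"
    PUBLIC = "Public"


class Dataset:
    """Hypothetical model of a dataset and its status transitions."""

    def __init__(self, name: str, from_contribution: bool = False):
        self.name = name
        # Contribution-created datasets start as Pristine;
        # administrator-created ones start as Staging, even when empty.
        self.status = Status.PRISTINE if from_contribution else Status.STAGING

    def load_data(self) -> None:
        # Loading data moves a Pristine dataset to Staging automatically.
        if self.status is Status.PRISTINE:
            self.status = Status.STAGING

    def set_status(self, new_status: Status) -> None:
        # Manual switch: only Staging <-> Public is allowed.
        if Status.PRISTINE in (self.status, new_status):
            raise ValueError("Pristine cannot be set or changed manually")
        self.status = new_status

    def visible_to_visitors(self) -> bool:
        # Only Public datasets are listed and searchable for visitors.
        return self.status is Status.PUBLIC


contributed = Dataset("contributed_dataset", from_contribution=True)
contributed.load_data()                 # Pristine -> Staging
contributed.set_status(Status.PUBLIC)   # published by the administrator
print(contributed.status.value, contributed.visible_to_visitors())
```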

Dataset facets

When a large number of datasets is loaded in ShowVoc, the default view, which consists of a flat list of datasets, may turn out to be poorly organized and difficult to read. This is where the facet-based view comes in handy. This view allows you to group datasets according to a chosen facet, making the view cleaner and better organized.

Facets are aspects and attributes that characterize a dataset. ShowVoc provides four pre-defined facets: Model and Lexicalization, which are unmodifiable attributes chosen during the creation of the dataset, plus Category and Organization.

In addition to such pre-defined facets, ShowVoc allows the administrator to define custom facets.

Clicking the "cog" button in the top right corner of the panel and selecting the Custom project facets schema settings entry opens the following dialog.

A new facet can be added through the "plus" button. The editor then shows several fields that can be filled in to define the new facet:

Clicking the Edit facets entry in the dataset context menu (under the Actions column) opens the following dialog, which allows the dataset facets to be edited. As you can see, it contains the built-in facets Category and Organization plus the custom one just defined.

Now that we know what facets are, how to define custom ones and how to set them on a dataset, let's see how they can be used to organize the dataset view. As we have already seen, the "cog" button, placed on the top bar of the dataset list, opens a menu. By selecting Dataset view settings, the user can customize the visualization of the dataset view.

Here it is possible to choose between two visualization modes: List, namely the "classic" flat list of datasets, and Facet based. When the latter is selected, a Facet selector appears, which allows the selection of the facet by which the datasets are grouped.

Here is an example of the facet-based visualization based on Category. The categories shown here have been assigned arbitrarily by the administrator. The last one, "Unclassified", is a dedicated group that collects the datasets for which the category (or, in general, the chosen facet) has not been specified.
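The grouping behind the facet-based view can be sketched as follows. This is an illustrative model only: the dictionary layout of a dataset, the function name and the alphabetical ordering of the groups are assumptions, while the trailing "Unclassified" group follows the description above.

```python
from typing import Dict, List

UNCLASSIFIED = "Unclassified"


def group_by_facet(datasets: List[dict], facet: str) -> Dict[str, List[str]]:
    """Group dataset names by the value of the chosen facet (hypothetical helper).

    Datasets with no value for the facet go into the "Unclassified" group,
    which is always placed last.
    """
    groups: Dict[str, List[str]] = {}
    for ds in datasets:
        value = ds.get("facets", {}).get(facet) or UNCLASSIFIED
        groups.setdefault(value, []).append(ds["name"])
    # Assumption: named groups are ordered alphabetically.
    ordered = sorted(key for key in groups if key != UNCLASSIFIED)
    if UNCLASSIFIED in groups:
        ordered.append(UNCLASSIFIED)
    return {key: groups[key] for key in ordered}


# Hypothetical datasets and categories, for illustration only.
datasets = [
    {"name": "ds1", "facets": {"Category": "Law"}},
    {"name": "ds2", "facets": {}},
    {"name": "ds3", "facets": {"Category": "Agriculture"}},
]
for group, names in group_by_facet(datasets, "Category").items():
    print(group, names)
```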

The visualization preferences can also be changed through the Datasets view settings dialog in the Datasets page.

Super User capabilities

The Datasets manager table is also accessible to the Super User. From this page, the Super User is authorized to create new datasets and to perform most of the actions described in this page, with some exceptions. In particular: