Setting up Jena Fuseki using a Docker image

Apache Jena is a free and open-source Java framework for building Semantic Web and Linked Data applications. The framework is composed of several APIs that work together to process RDF data, as shown in the figure below.

This post focuses on setting up Fuseki using Docker. In short, Fuseki is a SPARQL server that can present RDF data and answer SPARQL queries over HTTP.

The content of this post is as follows:

  • Run Fuseki server with Docker image
  • Configuration of Fuseki data service
  • Test SPARQL queries on the server using cURL


Run Fuseki server with Docker image

If you are familiar with Docker, it is very convenient to set up a Fuseki server using a Fuseki Docker image from Docker Hub, a cloud-based repository service provided by Docker for finding, storing, and sharing container images.

I have Docker Desktop on my laptop, which provides an integrated environment for building, shipping, and running containerized applications using Docker.

After pulling the Fuseki Docker image, we can follow the instructions on the image page to run a Fuseki server right away:


docker run --rm -it -p 3030:3030 --name fuseki \
  -e ADMIN_PASSWORD=[PASSWORD] \
  -e ENABLE_DATA_WRITE=[true|false] \
  -e ENABLE_UPDATE=[true|false] \
  -e QUERY_TIMEOUT=[number in milliseconds] \
  --mount type=bind,source="$(pwd)"/fuseki-data,target=/fuseki-base/databases \
  secoresearch/fuseki

The server should be accessible at http://localhost:3030.
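
When the container is started from a script, it can be handy to wait until the server actually answers before sending any queries. Below is a minimal Python sketch of such a check, using only the standard library. The function name wait_for_server and its parameters are my own illustration, and polling the root URL for an HTTP 200 is an assumption based on the address above, not something the image documents.

```python
import time
import urllib.request


def wait_for_server(base_url="http://localhost:3030", timeout_s=30.0, interval_s=1.0):
    """Poll base_url until it answers with HTTP 200 or timeout_s elapses.

    Hypothetical helper: the URL is the one used in this post and the
    "200 means ready" rule is an assumption, not a documented contract.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(base_url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers refused connections, timeouts, and HTTP error statuses
            # (urllib.error.URLError and HTTPError are OSError subclasses).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling wait_for_server() right after docker run returns True once the server responds, or False if nothing answers within the timeout.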


Configuration of Fuseki data service

So far so good. Note, however, that the Fuseki server is now running with its default configuration. The instructions on the Fuseki Docker image page also describe how to supply a custom configuration: add an assembler.ttl file under the fuseki-configuration/ folder. You can find this .ttl file in the GitHub repo of the Fuseki Docker image provider. The instructions for running the Fuseki server with a custom configuration are as follows:


mkdir fuseki-data
mkdir fuseki-configuration
cp -p assembler.ttl fuseki-configuration/
# edit fuseki-configuration/assembler.ttl to enable the endpoints you wish
docker run --rm -it -p 3030:3030 --name fuseki \
  -e ADMIN_PASSWORD=[PASSWORD] \
  -e QUERY_TIMEOUT=[number in milliseconds] \
  --mount type=bind,source="$(pwd)"/fuseki-data,target=/fuseki-base/databases \
  --mount type=bind,source="$(pwd)"/fuseki-configuration,target=/fuseki-base/configuration \
  secoresearch/fuseki

Alternatively, we can write our own configuration file based on the Fuseki Data Service Configuration Syntax. In my case, I simply want to test with some dummy RDF data that is loaded when the server starts, so I set up a MemoryModel backed by a .ttl file containing the RDF data I'm interested in. Every time the server starts, it contains an RDF dataset that I can play around with and run SPARQL queries over. An excerpt of the configuration:



<#service> rdf:type fuseki:Service ;
    fuseki:name              "ds" ;   # http://host:port/ds
    fuseki:dataset           <#tdb> ;
    fuseki:endpoint [ 
         # SPARQL query service
        fuseki:operation fuseki:query ; 
        fuseki:name "sparql"
    ] ;
    
    ... ...
        
<#tdb>    rdf:type ja:RDFDataset ;
    rdfs:label "EnergyConsumption" ;
    ja:defaultGraph
      [ rdfs:label "DAYTON.ttl" ;
        a ja:MemoryModel ;
        ja:content [ja:externalContent <ttl file location>] ;
      ] ;
    .

Test SPARQL queries on the server 

If you are planning to interact with the Fuseki server from your own program, for example in Python, you might want to test SPARQL queries over HTTP first. One way is to use the browser directly and enter the endpoint URL with the query as a parameter:



http://localhost:3030/ds/sparql?query=SELECT%20*%20WHERE%20{?s%20?p%20?o}%20limit%203

If you are using curl to test a SPARQL query, you can submit a URL-encoded SPARQL query:


curl "http://localhost:3030/ds/sparql?query=SELECT%20*%20WHERE%20\{?s%20?p%20?o\}%20limit%203"

The opening and closing braces need to be escaped because curl otherwise interprets {} as URL globbing syntax; passing curl's -g (--globoff) option is an alternative. For more details on using curl for SPARQL queries, refer to cURLing SPARQL, which covers the topic in much more depth.
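
Since the end goal is to query the server from a program, here is a minimal Python sketch of the same request using only the standard library. It percent-encodes the query, so no manual escaping of braces is needed, and it asks for results in the standard SPARQL JSON results format. The helper names (build_query_url, bindings_to_rows, run_select) are my own. The endpoint http://localhost:3030/ds/sparql assumes the "ds" service and "sparql" endpoint names from the configuration above.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(endpoint, sparql):
    """Return the endpoint URL with the SPARQL text percent-encoded as ?query=..."""
    return endpoint + "?" + urllib.parse.urlencode({"query": sparql})


def bindings_to_rows(results):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    variables = results["head"]["vars"]
    return [
        # Unbound variables are simply absent from a binding, so skip them.
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in results["results"]["bindings"]
    ]


def run_select(endpoint, sparql):
    """Send a SELECT query over HTTP GET and return the decoded rows."""
    request = urllib.request.Request(
        build_query_url(endpoint, sparql),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return bindings_to_rows(json.load(response))
```

With the server running, run_select("http://localhost:3030/ds/sparql", "SELECT * WHERE {?s ?p ?o} LIMIT 3") returns a list of up to three dicts keyed by s, p, and o.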