In all analyses we paid special attention to excluding any queries that returned incorrect or incomplete results.
Notebook | Description | Input | Output |
--- | --- | --- | --- |
01_LogsToCSV | Parses benchmarker output logs to generate queryevents.csv files | Log files (see tab Benchmarker Logs) | *_queryevents.csv, files can be found here |
02_QueryCorrectness_Interthread | For every RDF system we verify whether a query always returns the same result in every thread | *_queryevents.csv | None |
03_QueryCorrectness_CountQueries | Count queries always have #results = 1, so the number of results alone cannot reveal errors; this notebook verifies whether the count queries return the same actual value (queries identified via hash) | Dump of full results of benchmarks, files too large for GitHub | CountQueryConsistency.csv has the first result per query |
04_ResultsPerQueryDF | No inconsistencies were found between threads, so the number of results per query can be calculated unambiguously per query engine here | *_queryevents.csv | resultsperquery_csv |
05_QueryCorrectness_Intersim_Watdiv | Calculate the number of results per query by consensus (WatDiv) | resultsperquery_csv | Consensus per benchmark , csv_correct/* |
06_QueryCorrectness_Intersim_Ontoforce | Calculate the number of results per query by consensus (Ontoforce) | resultsperquery_csv | Consensus per benchmark , csv_correct/* |
07_ErrorAnalysis_Ontoforce | Gives an overview of all vendor systems on the Ontoforce benchmark. Each query execution is assigned a status: success, error, incorrect, timeout, or unknown (engine crashed). | csv_correct/* | Fig 09 |
08_FeaturesOfFailedQueries_Ontoforce | First attempt at visualizing properties of failed Ontoforce queries using parallel coordinates | csv_correct/* , ontoforce_query_features | None |
09_FeaturesOfFailedQueries_Ontoforce_2_FeatureCorrelations | Query Feature Analysis: Which query features are correlated? | ontoforce_query_features | None |
10_DTreesPerBenchmark | First Attempt at Decision Tree Analysis: which features determine outcome of a query? | csv_correct/* , ontoforce_query_features | None |
11_DTreesForAllSims | Generates decision trees for all simulations combined, with the query engine included as a feature | csv_correct/* , ontoforce_query_features | Trees All Sims |
12_DTreesForAllSims2 | Generates decision trees for all simulations combined, with the query engine included as a feature | csv_correct/* , ontoforce_query_features | Trees All Sims |
13_CachingAnalytics | Studying the effect of caching by comparing the fastest to the slowest run of a query | csv_correct/* | query_events_sorted |
14_CachingAnalytics_2 | Studying the effect of caching by comparing the fastest to the slowest run of a query | query_events_sorted | Caching Fig. |
15_BenchmarkSurvival | Calculating the benchmark survival interval | *_queryevents.csv | Figure BM Survival |
16_QueriesThatCrashSimulations | What type of query is the first one to fail, or the first one to crash a query engine? | query_events_sorted | None |
17_SingleMultiClientRuntimes | Query runtimes during warmup (single-threaded) vs stress test (5 threads) | query_events_sorted | Figure Server Load |
18_RuntimeAnalysisCSV_DiscardIncorrectQueries | Query Runtime Analysis (discarding incorrect queries!) | csv_correct/* | runtime_csv_correct |
19_RuntimeVisualAnalysisBoxplots_DiscardIncorrectQueries | Boxplots Query Runtimes | runtime_csv_correct | Figures ResultsI |
20_QueryTemplateAnalysisWatdiv_DiscardIncorrectQueries | Query Runtimes Per Query Template (WatDiv) | runtime_csv_correct | Figures ResultsII |
21_BenchmarkCostVisualizations_ExcludeIncorrectQueries | Compares all simulations in terms of runtime cost, taking into account cloud costs, licensing costs, and runtimes | runtime_csv_correct , sim_cost, loadtimes | Figure ResultsIII |
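
The correctness steps in notebooks 02 and 05/06 can be illustrated with a minimal sketch. It is not the notebooks' actual code: the row layout `(engine, thread, query_id, n_results)`, the engine names, and the toy data are hypothetical, and the consensus rule (a strict majority of engines must agree on the result count) is an assumption, since the table does not say how consensus is defined.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows: (engine, thread, query_id, n_results).
# The real *_queryevents.csv files may use different columns.
events = [
    ("engineA", 1, "q1", 10), ("engineA", 2, "q1", 10),
    ("engineB", 1, "q1", 10), ("engineB", 2, "q1", 10),
    ("engineC", 1, "q1", 7),  ("engineC", 2, "q1", 7),
    ("engineA", 1, "q2", 3),  ("engineA", 2, "q2", 4),
]

def interthread_inconsistencies(rows):
    """Return {(engine, query_id): set of counts} where threads disagree."""
    seen = defaultdict(set)
    for engine, _thread, query_id, n_results in rows:
        seen[(engine, query_id)].add(n_results)
    return {key: counts for key, counts in seen.items() if len(counts) > 1}

def consensus_counts(results_per_query):
    """Map each query to the count reported by a strict majority of engines.

    Assumed rule: if no count has a strict majority, the consensus is None.
    """
    by_query = defaultdict(list)
    for (_engine, query_id), n_results in results_per_query.items():
        by_query[query_id].append(n_results)
    consensus = {}
    for query_id, counts in by_query.items():
        value, votes = Counter(counts).most_common(1)[0]
        consensus[query_id] = value if votes * 2 > len(counts) else None
    return consensus

# Step 1: flag (engine, query) pairs whose threads disagree.
inconsistent = interthread_inconsistencies(events)

# Step 2: one result count per (engine, query), skipping flagged pairs.
per_query = {
    (engine, query_id): n_results
    for engine, _thread, query_id, n_results in events
    if (engine, query_id) not in inconsistent
}

# Step 3: consensus per query, then mark deviating engines as incorrect.
consensus = consensus_counts(per_query)
incorrect = [
    key for key, n_results in per_query.items()
    if consensus[key[1]] is not None and n_results != consensus[key[1]]
]
```

In this toy data, `engineA` disagrees with itself on `q2` and `engineC` deviates from the other two engines on `q1`. Queries flagged this way would be the ones left out of the runtime analyses.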