There were two presentations during this session. The first was by Matthew Lincoln, a PhD candidate at the University of Maryland. His project spans 1500-1750 and seeks to trace connections between the designers, engravers, and publishers of the many Dutch prints held by the British Museum. These prints are often signed and dated, which gives Matthew an enormous amount of data to work with. He showed us subgraphs presenting a 10-year rolling window of connections, with artists as nodes and prints as edges. The two big questions he is trying to answer are: 1) Did Dutch printmaking become more or less centralized during the Golden Age? and 2) Did rising Dutch prosperity instead support a more distributed network?
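To make the rolling-window idea concrete, here is a minimal Python sketch of how such subgraphs could be built. It is not Matthew's code (he works in R), and the records, the names in `PRINTS`, and the helper functions are all hypothetical placeholders rather than British Museum data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: (year, designer, engraver, publisher). Not real data.
PRINTS = [
    (1600, "designer_a", "engraver_a", "publisher_a"),
    (1604, "designer_a", "engraver_b", "publisher_a"),
    (1612, "designer_b", "engraver_b", "publisher_b"),
    (1625, "designer_b", "engraver_c", "publisher_b"),
]


def window_edges(prints, start, width=10):
    """Count ties between people who share a print dated start..start+width-1."""
    edges = Counter()
    for year, *people in prints:
        if start <= year < start + width:
            # Every pair of people named on the same print gets one edge.
            for a, b in combinations(sorted(set(people)), 2):
                edges[(a, b)] += 1
    return edges


def rolling_windows(prints, first, last, width=10, step=1):
    """Yield (start_year, edges) for each full window inside first..last."""
    for start in range(first, last - width + 2, step):
        yield start, window_edges(prints, start, width)


if __name__ == "__main__":
    for start, edges in rolling_windows(PRINTS, 1600, 1629, step=5):
        print(start, start + 9, len(edges), "edges")
```

Each window yields its own small graph, so a metric such as centralization can be computed per window and tracked across the whole period.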
Next he showed us random graph generation that relates to these questions: the first graph was an Erdős-Rényi graph, with edges added at random, and the second was a scale-free graph, whose node degrees follow a power-law distribution (a few nodes hold most of the edges, a "rich get richer" scenario). The goal of artists was to seek out the most successful people to work with: “printmakers needed expert collaborators”. Matthew's closing point was that visualizations are great, but medium to large datasets deserve metrics: "they need simulation, not just speculation."
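The contrast between the two random models can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the simulation shown in the talk: the scale-free graph is generated with a simple preferential-attachment rule (in the style of Barabási-Albert), centralization is measured with Freeman's degree centralization, and the sizes (500 nodes, 996 edges) are arbitrary.

```python
import random
from collections import Counter
from itertools import combinations


def random_graph(n, m_edges, rng):
    """Erdős-Rényi G(n, M): pick M distinct node pairs uniformly at random."""
    return set(rng.sample(list(combinations(range(n), 2)), m_edges))


def preferential_graph(n, m, rng):
    """Each new node links to m existing nodes, chosen in proportion to degree."""
    edges = set()
    targets = list(range(m))
    pool = []  # every node appears once per edge endpoint it holds
    for new in range(m, n):
        edges.update((t, new) for t in targets)
        pool.extend(targets)
        pool.extend([new] * m)
        chosen = set()
        while len(chosen) < m:  # draw until m distinct targets are found
            chosen.add(rng.choice(pool))
        targets = list(chosen)
    return edges


def degrees(n, edges):
    """Return the degree of every node 0..n-1, including isolated nodes."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return [deg[i] for i in range(n)]


def degree_centralization(deg):
    """Freeman degree centralization: 1.0 for a star, 0.0 for a regular graph."""
    n, top = len(deg), max(deg)
    return sum(top - d for d in deg) / ((n - 1) * (n - 2))


rng = random.Random(0)
er = random_graph(500, 996, rng)     # edges added at random
sf = preferential_graph(500, 2, rng)  # "rich get richer"; also 996 edges

if __name__ == "__main__":
    for name, g in (("random", er), ("scale-free", sf)):
        d = degrees(500, g)
        print(name, "max degree:", max(d), "centralization:", degree_centralization(d))
```

Holding the number of nodes and edges equal across both models means any difference in centralization comes from how the edges are distributed, which is the kind of baseline an observed network can then be compared against.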
The second and last presenter was Maxim Romanov, a post-doc working with a large collection of digitized Arabic texts. Some questions he is trying to answer are: What is the volume of the Islamic written legacy? How much has been digitized, published, survived, or written? Maxim works with digitized libraries, and the data to extract from them includes names, places, dates, and book titles. He explained that one can parse these records computationally because the phrasing is formulaic (e.g., there are only 10 different ways to say “he composed something”), and he supplemented this with an example of a library record in Arabic.
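As a rough illustration of why a small, closed set of formulaic phrases makes such records machine-readable, here is a Python sketch using a regular expression. Everything in it is an assumption made for the example: the phrases are English stand-ins for the Arabic formulas, the quoted-title convention is invented, and the sample record is hypothetical.

```python
import re

# Hypothetical English stand-ins for the formulaic phrases; the real
# records are in Arabic and use their own conventions for titles.
COMPOSED_PHRASES = ["he composed", "he wrote", "he compiled", "he authored"]

# One alternation over the known phrases, followed by a quoted title.
TITLE_PATTERN = re.compile(
    "(?:" + "|".join(re.escape(p) for p in COMPOSED_PHRASES) + r')\s+"(?P<title>[^"]+)"',
    re.IGNORECASE,
)


def extract_titles(record):
    """Return every title introduced by one of the known phrases."""
    return [m.group("title").strip() for m in TITLE_PATTERN.finditer(record)]


if __name__ == "__main__":
    sample = 'Died in 1200. He composed "Book One"; he wrote "Book Two".'
    print(extract_titles(sample))
```

The same approach extends to names, places, and dates: once the finite list of introductory formulas is known, each one becomes another alternation in a pattern.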
Maxim's graphs showed trends from thousands of texts. Things to look at are the authors' geographical identities, their networks, and their social and religious identities. He explained that most books are in Arabic, but some are in Persian and Ottoman Turkish, and he stressed the importance of the religious affiliations that can be pulled from the records. The major specializations among these writings are Hadith, Languages, Legal, and the Qur'an. The most consistently written about of these over the span of 662-1882 (the range of his graph) was Legal: writers never stopped creating legal literature. Maxim's last point was about geography. There are connections between authors in different locations, from Andalus to Egypt, Syria, Iraq, and Iran. On the graph, the size of the nodes and the thickness of the edges indicate the strength of the connections, and these sizes and thicknesses change over time (by century).
Both presenters use R: Matthew uses it for statistical data analysis and Maxim uses it for map visualization. Maxim also uses Python. It was great to find out what tools they use after having seen their data and questions presented.