Studi sul Cristianesimo Primitivo

Investigating the Authenticity of Pliny the Younger’s Letter to Trajan Concerning the Christians

Posted on 4/3/2016, 22:22 by: Porterble




QUOTE (Saulnier @ 4/3/2016, 21:07) 
QUOTE (Porterble @ 4/3/2016, 11:43) 
I was wondering whether you have considered repeating the analysis using LSA as the document comparison method? One option would be to apply the methodology of Satyam et al. (2014) to the n-gram analysis; another would be simply to apply LSA to the text itself, using the whole of the Plinian corpus as the comparator.

Thank you for this suggestion. I have not considered using Latent Semantic Analysis on n-grams. It sounds interesting. Can you provide some references?

A method such as that of Satyam et al. would work, applying LSA to the n-gram dataset.

The longer path would be to create a reference corpus of all of Pliny's works, and then run an n-way LSA comparison between the documents to build a second similarity measure. The LSA engine on the University of Colorado site could in theory be used for this, but the data might need massaging into another format: the last time I used the engine heavily, it did not support Unicode properly. It may have been updated since then, as I haven't worked seriously in computational linguistics since 2010 or so (most of my work involved the analysis of decision making and rhetorical intent).
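For illustration only, here is a rough sketch of what an n-way LSA comparison over character n-grams might look like in Python with NumPy. It is a generic LSA pipeline (weighted n-gram counts, truncated SVD, cosine similarity), not the specific procedure of Satyam et al. and not what the Colorado engine does. The function names, the weighting scheme, the choices `n=3` and `k=2`, and the toy strings are all assumptions made for the example; the strings are placeholders, not Plinian text. Python 3 strings are Unicode, so no format conversion is needed on that account.

```python
from collections import Counter

import numpy as np


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (lowercased, whitespace collapsed)."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def lsa_similarity(docs, n=3, k=2):
    """Pairwise cosine similarity of documents in a k-dimensional LSA space."""
    counts = [char_ngrams(d, n) for d in docs]
    vocab = sorted(set().union(*counts))

    # Term-document matrix: one row per n-gram, one column per document.
    X = np.array([[c[g] for c in counts] for g in vocab], dtype=float)

    # Assumed weighting: log-scaled counts times a smoothed inverse
    # document frequency. Other schemes (e.g. log-entropy) are common too.
    df = (X > 0).sum(axis=1)
    idf = np.log((1 + len(docs)) / (1 + df)) + 1.0
    X = np.log1p(X) * idf[:, None]

    # Truncated SVD: keep only the k largest singular values.
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    k = min(k, len(s))
    doc_vecs = (s[:k, None] * vt[:k]).T  # one row per document

    # Normalise rows so the dot product is the cosine similarity.
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = doc_vecs / norms
    return unit @ unit.T


# Toy placeholder documents (hypothetical, not real corpus text).
docs = ["abc abc abc", "abc abc abd", "xyz xyz xyz"]
sim = lsa_similarity(docs, n=3, k=2)
print(np.round(sim, 3))
```

Entry `[i, j]` of the returned matrix is the cosine similarity between documents `i` and `j` in the reduced space. With a real corpus, the number of retained dimensions `k` and the n-gram length would need tuning, and a disputed letter would simply be one more document in `docs`.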

Also, I was wondering whether anyone has done stylometric work of this kind that takes the presence of amanuenses into account?

Hm, the forum keeps stripping out the URLs. The Satyam paper is entitled:
A Statistical Analysis Approach to Author Identification Using Latent Semantic Analysis: Notebook for PAN at CLEF 2014
by Satyam, Anand, Arnav Kumar Dawn, and Sujan Kumar Saha

The LSA engine is at lsa (dot) colorado (dot) edu.
 
16 replies since 15/2/2016, 21:23   694 views