Text as Data: Methods Appendix
This concluding chapter details how the nearly 170,000 House press releases used in this study were classified as credit claiming or not. Drawing on recent Text as Data methods, the study begins with 800 triple-hand-coded documents, each of which carries a label. The idea is to learn the relationship between the hand-coded labels and the words in the texts, and then to use that relationship to predict a label for each of the remaining documents, so that every press release ends up labeled. The chapter then presents a series of simplifying assumptions that make statistical modeling of the texts feasible.
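The learn-then-predict workflow described above can be illustrated with a minimal sketch. The abstract does not name the classifier or the specific simplifying assumptions, so everything here is an assumption made for illustration: a bag-of-words representation (word order ignored), a multinomial naive Bayes model with add-one smoothing, and a handful of invented toy texts standing in for the hand-coded and unlabeled press releases. The function names `tokenize`, `train`, and `predict` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Bag-of-words assumption: lowercase, split on whitespace, ignore order.
    return text.lower().split()


def train(docs, labels):
    """Fit a multinomial naive Bayes model on hand-labeled documents."""
    word_counts = {}              # label -> Counter of word frequencies
    doc_counts = Counter(labels)  # label -> number of training documents
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest posterior log-probability."""
    word_counts, doc_counts, vocab = model
    n_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / n_docs)  # log prior
        total = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # words never seen in training carry no signal
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy texts (NOT from the study) standing in for hand-coded releases.
labeled_docs = [
    "congressman announces grant for local fire department",
    "representative secures funding for new bridge",
    "congressman criticizes administration budget proposal",
    "representative statement on foreign policy vote",
]
labels = ["credit", "credit", "other", "other"]

# Invented toy texts standing in for the remaining, unlabeled releases.
unlabeled_docs = [
    "congresswoman announces funding for local airport",
    "statement on the budget vote",
]

model = train(labeled_docs, labels)
predictions = [predict(model, doc) for doc in unlabeled_docs]
print(predictions)  # ['credit', 'other']
```

In a setting like the one described, the training step would run on the 800 hand-coded documents and the prediction step on the rest of the collection; the model family and preprocessing would follow whatever the chapter itself specifies.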