You can import text data from a PDF document using the pdftools
package. Here is how.
- Open the “Manage R package” dialog from the project menu.
- Click the “Install New Packages” tab, type in pdftools, and click “Install”.
Click the “+” icon next to Data Frame and select “R Script” to create a new R Script Data Frame.
Type in the following script, replacing <FILE_PATH_TO_PDF> with the full path to your PDF file.
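# Read the PDF; pdf_text() returns one string per page.
pdf.text <- pdftools::pdf_text("<FILE_PATH_TO_PDF>")
# Return a data frame with one row per page, keeping text as character.
data.frame(text = pdf.text, stringsAsFactors = FALSE)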
You get the full text in a single column called ‘text’, with one row per page of the PDF. To extract the data you need, see the Text Wrangling document.
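As a first wrangling step, it can help to break the page text into one row per line. The following is a minimal sketch, not a definitive recipe: the sample strings are hypothetical stand-ins for what pdftools::pdf_text() returns, and the names pages, line.list, and lines.df are made up for illustration. It assumes lines within a page are separated by newline characters.

```r
# Hypothetical stand-in for the output of pdftools::pdf_text():
# a character vector with one element per page.
pdf.text <- c("Title\nFirst line of page 1",
              "First line of page 2\n\nSecond line")

# One row per page, with a page number column.
pages <- data.frame(page = seq_along(pdf.text), text = pdf.text,
                    stringsAsFactors = FALSE)

# Split each page into lines on newline characters (assumed separator).
line.list <- strsplit(pages$text, "\n", fixed = TRUE)

# One row per line, keeping the page each line came from.
lines.df <- data.frame(
  page = rep(pages$page, lengths(line.list)),
  text = trimws(unlist(line.list)),
  stringsAsFactors = FALSE
)

# Drop empty lines and return the result as the last expression.
lines.df <- lines.df[lines.df$text != "", , drop = FALSE]
lines.df
```

In a real script you would replace the sample vector with the pdf_text() call shown earlier and keep the data frame as the last expression.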