# Google Apps Script pdfToText Utility

This helper function converts a given PDF file blob into text. It also offers options to save the original PDF, the intermediate Google Doc, and/or the final plain text file. The language used for Optical Character Recognition (OCR) may be specified; it defaults to 'en' (English).

Note: Updated 12 May 2015 due to deprecation of DocsList. Thanks to Bruce McPherson for the `getDriveFolderFromPath()` utility.

```
// Start with a Blob object
var blob = gmailAttachment.getAs(MimeType.PDF);

// fileId will be the ID of a saved text file (default behavior):
var fileId = pdfToText( blob );

// filetext will contain text from the PDF file; no residual files are saved:
var filetext = pdfToText( blob, {keepTextfile: false} );

// We can save other converted file types, too:
var options = {
  keepPdf : true,            // Keep a copy of the original PDF file.
  keepGdoc : true,           // Keep a copy of the OCR Google Doc file.
  keepTextfile : true,       // Keep a copy of the text file. (default)
  path : "attachments/today" // Folder path to store file(s) in.
};
filetext = pdfToText( blob, options );
```
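The option handling described above can be pictured with a small sketch of how caller-supplied options might be merged over defaults. This is an illustration only, not the utility's actual code: `resolveOptions` is a hypothetical helper, the option name `ocrLanguage` is an assumption (the text above only states that the OCR language defaults to 'en'), and the `false` defaults for `keepPdf` and `keepGdoc` are inferred from the "no residual files are saved" comment.

```javascript
// Hypothetical helper (not part of pdfToText): merge caller options over defaults.
function resolveOptions(options) {
  // Assumed defaults: only the text file is kept, and OCR runs in English.
  // The name `ocrLanguage` is an assumption made for this illustration.
  var defaults = {
    keepPdf: false,
    keepGdoc: false,
    keepTextfile: true,
    ocrLanguage: 'en'
  };

  var resolved = {};
  var key;

  // Start from a copy of the defaults so they are never mutated.
  for (key in defaults) {
    if (Object.prototype.hasOwnProperty.call(defaults, key)) {
      resolved[key] = defaults[key];
    }
  }

  // Overlay whatever the caller supplied; a missing argument means "all defaults".
  options = options || {};
  for (key in options) {
    if (Object.prototype.hasOwnProperty.call(options, key)) {
      resolved[key] = options[key];
    }
  }

  return resolved;
}
```

Under these assumptions, `resolveOptions({keepTextfile: false})` turns off the text file while leaving the OCR language at 'en', and calling `resolveOptions()` with no argument returns the defaults unchanged. Copying the defaults into a fresh object keeps repeated calls independent of one another.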