rccharles

ASC adjust clipboard May 20, 2019

  1. (*
  This AppleScript converts clipboard input into a format suited for pasting into an ASC
  reply.  I observed that text I copied into an ASC reply was not formatted well, while
  copies from a web browser were formatted much better.  This script therefore adjusts the
  clipboard copy to the format a web browser produces.

 This AppleScript accepts the clipboard in either of two forms:
 -- plain text, which is converted to HTML.  Conversion is limited to inserting paragraph tags for blank lines and inserting links where http or https text appears.  The page title is substituted for the link text.
 -- HTML source code, identified by text containing HTML markup.
         Caveat emptor.
  11.  
 To use:
 1) Copy (command + C) the data you want to convert.
 2) Run this AppleScript by double-clicking the app.
 3) Paste (command + V) into an ASC reply.

 I have tested with Waterfox 56.2.9 on Yosemite.  I assume the process will work with other web browsers and other versions of macOS.
  18.  
  19.  Save as an Application Bundle.  Don't check any of the boxes.
  20.  
 Should you experience a problem, run the script in the Script Editor.
   This script shows how to debug via the on run path, how to handle items added to a folder, and how to use log statements.
   Problems are easier to diagnose with debug information, so I suggest adding log statements to your own scripts to see what is going on.  This script is an example.

 For testing, run in the Script Editor:
        1) Click on the Event Log tab to see the output from the log statements.
        2) Click on Run.
  28.    
  29. change log
  30. may 1, 2019   -- skip 403 forbidding title
  31. may 2, 2019   -- convert \" to ".  the \" mysteriously appears in HTML source code input.  Probably some TextEdit artifact.
  32.                 copy to TextEdit copy out of TextEdit.
  33. may 7, 2019   -- regressed May 2nd update.  Applescript was inserting \" for display purposes into output.
  34. may 8, 2019   -- special processing for html class on clipboard
  35.                         https://pastebin.com/raw/Yg138YqT
  36. may 16,2019  -- fixed hexDumpFormatOne bugs and improved output
  37. may 16,2019  -- added hexDumpFormatZero
 may 19,2019  -- eliminate line breaks outside the <pre>...</pre> tags in HTML. ASC was interpreting line
                         breaks as meaningful <br>
                         instead of white space. Simplified line break code.
  41.                        https://pastebin.com/raw/Nq08cFYH
  42.        
  43.  
  44. enhancements:
  45.  -- get pdf title
  46.  
  47.  
  48. Author: rccharles
  49.  
  50. Copyright 2019 rccharles  
  51.      
  52.       Permission is hereby granted, free of charge, to any person obtaining a copy  
  53.       of this software and associated documentation files (the "Software"), to deal  
  54.       in the Software without restriction, including without limitation the rights  
  55.       to use, copy, modify, merge, publish, distribute, sublicense, and/or sell  
  56.       copies of the Software, and to permit persons to whom the Software is  
  57.       furnished to do so, subject to the following conditions:  
  58.        
  59.       The above copyright notice and this permission notice shall be included in all  
  60.       copies or substantial portions of the Software.  
  61.        
  62.       THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR  
  63.       IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,  
  64.       FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE  
  65.       AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER  
  66.       LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,  
  67.       OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE  
  68.       SOFTWARE.  
  69.  
  70.  
  71.     example text document: remember to edit out the returns.
  72.     set the clipboard to «data HTML3C68746D6C3E3C686561643E3C6D65746120687474702D65717569763D22636
  73.     F6E74656E742D747970652220636F6E74656E743D22746578742F68746D6C3B206368617273657
  74.     43D7574662D38223E3C2F686561643E3C626F64793E3C62723E0A202020203C62207374
  75.     796C653D22636F6C6F723A677265656E3B223E506172616C6C656C733C2F623E3A3C62723E0A2
  76.     0202020467265652076657273696F6E206F6620506172616C6C656C7320666F7220696E6
  77.     46976696475616C207573653A3C62723E0A68747470733A2F2F6974756E65732E6170706C652E
  78.     636F6D2F75732F6170702F706172616C6C656C732D6465736B746F702D6C6974652F696
  79.     4313038353131343730393F6D743D31323C62723E0A2020202046756C6C2076657273696F6E3C6
  80.     2723E0A202020203C6120687265663D22687474703A2F2F7777772E706172616C6C656C
  81.     732E636F6D2F656E2F70726F64756374732F6465736B746F702F223E687474703A2F2F7777772E7
  82.     06172616C6C656C732E636F6D2F656E2F70726F64756374732F6465736B746F702F3C2F6
  83.     13E3C62723E0A202020203C62723E0A202020203C623E564D7761726520467573696F6E3C2F62
  84.     3E3C62723E0A202020205769746820564D7761726520467573696F6E2C2072756E20746
  85.     865206D6F73742064656D616E64696E67204D616320616E642057696E646F77730A20202020617
  86.     0706C69636174696F6E7320730A6964652D62792D73696465206174206D6178696D756
  87.     D2073706565647320776974686F7574207265626F6F74696E673C62723E0A20202020687474703A
  88.     2F2F7777772E766D776172652E636F6D2F70726F64756374732F667573696F6E2F3C2F62
  89.     6F64793E3C2F68746D6C3E»
  90.    
  91.     Translated text is:
  92.     Full version<br>
  93.    <a href="http://www.parallels.com/en/products/desktop/">http://www.parallels.com/en/products/desktop/</a><br>
  94.    <br>
  95.    <b>VMware Fusion</b><br>
  96.    
  97.     set the clipboard to «data HTML2020202046756C6C2076657273696F6E3C62723E0A202020203C612068726566
  98.     3D22687474703A2F2F7777772E706172616
  99.     C6C656C732E636F6D2F656E2F70726F64756374732F6465736B746F702F223E687474703
  100.     A2F2F7777772E706172616C6C656C732E63
  101.     6F6D2F656E2F70726F64756374732F6465736B746F702F3C2F613E3C62723E0A20202020
  102.     3C62723E0A202020203C623E564D7761726520467573696F6E3C2F623E3C62723E0A»
  103.    
  104. set the clipboard to "Saturday, September 7, 2019
  105. Live streamed
  106. https://www.omf.ngo/community-symposium-2/"
  107.  
  108. set the clipboard to "\"Effective defenses 111 threats\" by John Galt
  109. https://discussions.apple.com/docs/DOC-8841
  110. \"Avoid phishing emails 222 and other scams\""
  111.  
  112. https://support.apple.com/en-ca/HT204759
  113.  
  114. blank lines
  115. also,see:http://www.google.com/ seeing again:http://www.google.com"
  116.  
  117. *)
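-- A minimal sketch, not part of the original script: the header comment above says that
-- http/https text found in plain-text input is turned into a link whose text is the page title.
-- The handler name exampleMakeLink and both parameters are hypothetical; target="_blank"
-- is copied from the sample data in this file.
on exampleMakeLink(theURL, theTitle)
    -- wrap the URL in an anchor tag and show the title as the link text
    return "<a href=\"" & theURL & "\" target=\"_blank\">" & theTitle & "</a>"
end exampleMakeLink
-- Usage: exampleMakeLink("https://discussions.apple.com/docs/DOC-8841", "Effective defenses")
-- returns <a href="https://discussions.apple.com/docs/DOC-8841" target="_blank">Effective defenses</a>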
  118.  
  119. (* For whatever reason, this segment doesn't work when moved above.
  120.     set the clipboard to "<p>Simple put, Apple attempts to provide all the malware detection and removal you need in Mac OS X.</p>
  121. <p>\"Effective defenses against malware and other threats\" by John Galt
  122. <a href=\"https://discussions.apple.com/docs/DOC-8841\" target=\"_blank\">Effective
  123. defenses against malware and ot… - Apple Community</a>
  124. </p><p> </p>"
  125.    *)
  126. (*
  127. set the clipboard to "<p>Simple put, Apple attempts to provide all the malware detection and removal you need in Mac OS X.</p>
  128. <p>\"Effective defenses against malware and other threats\" by John Galt
  129. <a href=\"https://discussions.apple.com/docs/DOC-8841\" target=\"_blank\">Effective
  130. defenses against malware and to… - Apple Community</a>
  131. </p><p> </p>"
  132. *)
  133. (*
  134.     set the clipboard to "Saturday, September 7, 2019
  135. Live streamed
  136. https://www.omf.ngo/community-symposium-2/"
  137. *)
  138.  
  139. (*
  140. set the clipboard to "<p>Simple put, Apple attempts to provide all the malware detection and removal you need in Mac OS X.</p>
  141. <p>\"Effective defenses against malware and other threats\" by John Galt
  142. <a href=\"https://discussions.apple.com/docs/DOC-8841\" target=\"_blank\">Effective
  143. defenses against malware and ot… - Apple Community</a>
  144. </p><pre>
  145. code line #a
  146. code line #b
  147. code line #c
  148. </pre><p> </p><p>\"Avoid phishing emails, fake 'virus' alerts, phony support calls, and other scams\"
  149. <a href=\"https://support.apple.com/en-ca/HT204759\" target=\"_blank\">Avoid phishing emails, fake
  150. 'virus' alerts, phony support calls, and other scams - Apple Support</a>
  151. <pre>
  152. code line #1
  153. code line #2
  154. code line #3
  155. </pre>
  156. </p><p> </p><p></p>Run etrecheck.  The first five runs are free. Provided a report on your
  157. machines hardware and software.  Great for diagnosing your system.  Click on the download
  158. link at the bottom of the screen.
  159. <a href=\"http://etrecheck.com/\" target=\"_blank\">EtreCheck</a></p><p></p>
  160. <p></p><p>
  161. <ol>
  162. <li>point 1</li>
  163. <li>point 2</li>
  164. <li>point 3</li>
  165. </ol>
  166. </p>
  167. <p>the end</p>"
  168. *)
  169. (*
  170.     set the clipboard to "<p>Simple put, Apple attempts to provide all the malware detection and removal you need in Mac OS X.</p>
  171. <pre>
  172. code line #a
  173. code line #b
  174. code line #c
  175. </pre><p> </p><p>\"Avoid phishing emails, fake 'virus' alerts, phony support calls, and other scams\"
  176. <pre>
  177. code line #1
  178. code line #2
  179. code line #3
  180. </pre>
  181. </p><p> </p><p>Run etrecheck.</p> "
  182.  
  183. *)
  184. (*
  185.     set the clipboard to "<p>Simple put, Apple attempts to provide all the malware detection and removal you need in Mac OS X.</p>
  186. <p>\"Effective defenses against malware and other threats\" by John Galt
  187. <a href=\"https://discussions.apple.com/docs/DOC-8841\" target=\"_blank\">Effective
  188. defenses against malware and ot… - Apple Community</a>
  189. </p><pre>
  190. code line #a
  191. code line #b
  192. code line #c
  193. </pre><p> </p><p>\"Avoid phishing emails, fake 'virus' alerts, phony support calls, and other scams\"
  194. <a href=\"https://support.apple.com/en-ca/HT204759\" target=\"_blank\">Avoid phishing emails, fake
  195. 'virus' alerts, phony support calls, and other scams - Apple Support</a>
  196. <pre>
  197. code line #1
  198. code line #2
  199. code line #3
  200. </pre>
  201. </p><p> </p><p></p>Run etrecheck.  The first five runs are free. Provided a report on your
  202. machines hardware and software.  Great for diagnosing your system.  Click on the download
  203. link at the bottom of the screen.
  204. <a href=\"http://etrecheck.com/\" target=\"_blank\">EtreCheck</a></p><p></p>
  205. <p></p><p>
  206. <ol>
  207. <li>point 1</li>
  208. <li>point 2</li>
  209. <li>point 3</li>
  210. </ol>
  211. </p>
  212. <p>the end</p>"
  213. *)
  214.  
  215. -- Gets invoked here when you run in AppleScript editor or double click on the app icon.
  216. on run
  217.     global debug
  218.    
  219.     -- Write a message into the event log.
  220.     log "  --- Starting on " & ((current date) as string) & " --- "
  221.    
  222.     set debug to 4
  223.     -- 0 no debugging
  224.     -- 1 displays input and output to this routine
  225.     -- 2 moderate
  226.     -- 3 display important "on" blocks
  227.     -- 4 display end of important "on" blocks
  228.     -- 5 display minor "on" blocks
  229.     -- 6 intense
  230.     set lf to character id 10
  231.    
  232.     if debug ≥ 3 then log "in --- run ---"
  233.    
  234.    
  235.     (*
  236.     set the clipboard to "<html><p>As you are using a non-Apple app to access your email or other  facilities, you are now required to use an 'app-specific' password in place of your normal iCloud password. In order to do this you need to set up two-factor authentication for your Apple ID, and for this you need to have either a Mac running El Capitan or above, or an iOS device running iOS9 or above.</p><p> </p><p><a href=\"https://support.apple.com/HT204915\" target=\"_blank\">Two-factor authentication for Apple ID - Apple Support</a></p><p> </p><p><a href=\"https://support.apple.com/HT204397\" target=\"_blank\">Using app-specific passwords - Apple Support</a></p><p> </p><p>If you are unable to set up two-factor authentication you should set up 2-step \"
  237.    
  238.    
  239.  
  240.  
  241. Run etrecheck.  The
  242. first five runs are free. Provided a report on your
  243. machines hardware and software.  Great for diagnosing your system.  Click on the download
  244. link at the bottom of the screen.
  245. <a href=\"http://etrecheck.com/\" target=\"_blank\">EtreCheck</a></p><p></p>
  246. <p></p><p>
  247. <ol>
  248. <li>point 1</li>
  249. <li>point 2</li>
  250. <li>point 3</li>
  251. </ol>
  252. </p>
  253. <p>the end</p>
  254. "
  255.     *)
  256.     (*  set the clipboard to "<p>Simple put, Apple attempts
  257. to provide all the
  258. malware detection and removal you need in Mac OS X.</p>
  259. <p></p><p></p><p></p>
  260. <p>\"Effective defenses against malware and other threats\" by John Galt
  261. <a href=\"https://discussions.apple.com/docs/DOC-8841\" target=\"_blank\">Effective
  262. defenses against malware and ot… - Apple Community</a>
  263. </p><pre>
  264. code line #a
  265. code line #b
  266. code line #c
  267. </pre><p> </p><p>\"Avoid phishing emails, fake 'virus' alerts, phony support calls, and other scams\"
  268. <a href=\"https://support.apple.com/en-ca/HT204759\">Avoid phishing emails, fake
  269. 'virus' alerts, phony support calls, and other scams - Apple Support</a>
  270. <pre>
  271. code line #1
  272. code line #2
  273. code line #3
  274. </pre>"
  275. *)
  276.     set theList to clipboard info
  277.     if debug ≥ 2 then printClipboardInfo(theList)
  278.    
  279.     set cbInfo to get (clipboard info) as string
  280.    
  281.     -- Most likely, if we have HTML data in the clipboard it will be from a web browser or Word.
  282.     if cbInfo contains "HTML" then
  283.        
  284.         if debug ≥ 2 then log "Working with HTML Class data from clipboard."
  285.         set theBoard to the clipboard as «class HTML»
  286.        
        set normalHtml to do shell script "osascript -e 'try' -e 'get the clipboard as «class HTML»' -e 'end try' | awk '{sub(/«data HTML/, \"\"); sub(/»/, \"\")} {print}' | xxd -r -p "
  288.         if debug ≥ 1 then
  289.             log "...Print out plain text version of inputed HTML data from the clipboard..." & return & normalHtml
  290.             hexDumpFormatOne("after converting to printable, normalHtml", normalHtml)
  291.         end if
  292.        
  293.         -- don't let Windoze confuse us. convert Return LineFeed to lf
  294.         set normalHtml to alterString(normalHtml, return & lf, lf)
        -- might as well convert classic Mac OS returns to lf too, so there are fewer cases to look for.
  296.         set normalHtml to alterString(normalHtml, return, lf)
  297.         if debug ≥ 2 then hexDumpFormatOne("after altering line ends to LFs normalHtml", normalHtml)
  298.        
  299.         set returnedData to adjustBrowserHTML(normalHtml)
  300.         if debug ≥ 2 then
  301.             log "...Print out plain text version of adjusted HTML data ..." & return & returnedData
  302.             log "...just printed plain text version"
  303.             log "printed in hex"
  304.             hexDumpFormatOne("returnedData", returnedData)
  305.         end if
  306.        
  307.         set returnedData to convertToHTML(returnedData)
  308.         try
  309.             if debug ≥ 2 then log "returnedData is " & returnedData
  310.         on error errStr number errorNumber
  311.             log "===> We didn't find HTML data.   errStr is " & errStr & " errorNumber is " & errorNumber
  312.             return 1
  313.         end try
  314.     else
  315.         -- will work with a plain html or plain text.
  316.         try
  317.             if debug ≥ 2 then log "Working with plain html or plain text"
  318.             set clipboardData to (the clipboard as text)
  319.             if debug ≥ 2 then
  320.                 log "class clipboardData is " & class of clipboardData
  321.                 log "continuing plain html or plain text"
  322.             end if
  323.            
  324.             if debug ≥ 1 then
  325.                 log "inputted clipboardData is " & clipboardData
  326.                 hexDumpFormatOne("inputted clipboardData", clipboardData)
  327.             end if
  328.         on error errStr number errorNumber
  329.             log "===> We didn't find data on the clipboard.   errStr is " & errStr & " errorNumber is " & errorNumber
  330.             display dialog "We didn't find HTML source code nor plain text on the clipboard." & return & "Please copy from a different source." giving up after 15
  331.             return 1
  332.         end try
  333.         if debug ≥ 2 then log "calling common"
  334.         set returnedData to common(clipboardData)
  335.     end if
  336.     if debug ≥ 2 then log "place on the clipboard returnedData is " & returnedData
  337.     postToCLipboard(returnedData)
  338.     -- return code
  339.     return 0
  340.    
  341. end run
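
-- A minimal sketch, not part of the original script: the run handler above decodes the
-- «data HTML…» hex dump with a shell pipeline (osascript | awk | xxd -r -p).  The hypothetical
-- handler below shows what the xxd -r -p step does, in plain AppleScript.
-- Assumptions: the input holds only hex digits and every byte is 7-bit ASCII;
-- multi-byte UTF-8 sequences would not decode correctly this way.
on exampleHexToText(hexString)
    set hexDigits to "0123456789ABCDEF"
    set decodedText to ""
    -- walk the string two hex digits (one byte) at a time
    repeat with i from 1 to ((length of hexString) - 1) by 2
        set highNibble to (offset of (character i of hexString) in hexDigits) - 1
        set lowNibble to (offset of (character (i + 1) of hexString) in hexDigits) - 1
        set decodedText to decodedText & (character id (highNibble * 16 + lowNibble))
    end repeat
    return decodedText
end exampleHexToText
-- Usage: exampleHexToText("3C62723E") returns "<br>"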
  342.  
  343. -- Folder actions.
  344. -- Gets invoked here when something is dropped on the folder that this script is monitoring.
-- Right click on the folder to be monitored. Services > Folder Actions Setup...
  346. on adding folder items to this_folder after receiving added_items
  347.     -- Write a message into the event log.
  348.     log "  --- Starting on " & ((current date) as string) & " --- "
  349.     display dialog "TBD, some assembly required."
  350. end adding folder items to
  351.  
  352.  
  353.  
  354. -- Gets invoked here when something is dropped on this AppleScript icon
  355. on open dropped_items
  356.     global debug
  357.     set debug to 2
  358.     -- Write a message into the event log.
  359.     log "  --- Starting on " & ((current date) as string) & " --- "
  360.     if debug ≥ 3 then log "in --- open ---"
  361.    
  362.     (*
  363.     -- Debug code.
  364.       set fileName to choose file with prompt "get file"
  365.       set dropped_items to {fileName}
  366.     *)
  367.     if debug ≥ 2 then log "class of dropped_items is " & class of dropped_items
  368.     display dialog "You dropped " & (count of dropped_items) & " item or items." & return & "  Caveat emptor. You have been warned." giving up after 6
  369.    
  370.     set totalFileData to ""
  371.     repeat with droppedItem in dropped_items
  372.         if debug ≥ 2 then
  373.             log "The droppedItem is "
  374.             -- display dialog "processing file " & (droppedItem as string) giving up after 3
  375.             log droppedItem
  376.             log "class = " & class of droppedItem
  377.         end if
  378.         set extIs to findExtension(droppedItem)
  379.         set extIsU to makeCaseUpper(extIs)
  380.         if extIsU is "HTML" or extIsU is "HTM" or extIsU is "TEXT" or extIsU is "TXT" then
            -- default to empty data so a failed read does not leave allOfFile undefined
            set allOfFile to ""
            try
                set theFile to droppedItem as string
                set fileRef to open for access file theFile
                set allOfFile to read fileRef
                close access fileRef
            on error theErrorMessage number theErrorNumber
                log "==> " & theErrorMessage & " error number " & theErrorNumber
                -- the file may never have been opened, so ignore a failing close
                try
                    close access file theFile
                end try
            end try
  390.             if debug ≥ 2 then printHeader("read from file ( allOfFile )", allOfFile)
  391.             set totalFileData to totalFileData & common(allOfFile)
  392.         else
  393.             -- we do not support this extension
            display dialog "We only support files with an extension of html, htm, text or txt in either case. Your file had a " & extIs & " extension. Skipping." giving up after 10
  395.         end if
  396.     end repeat
  397.    
  398.     postToCLipboard(totalFileData)
  399.     -- return code
  400.     return 0
  401.    
  402. end open
  403.  
  404.  
  405. -- ------------------------------------------------------
  406. on common(clipboardData)
  407.     global debug
  408.     if debug ≥ 3 then log "in --- common ---"
  409.     set ht to character id 9
  410.     set lf to character id 10
  411.    
  412.     set cbInfo to get (clipboard info) as string
  413.     (*
  414.     Symbol  Meaning                 Hex     Used
  415.         CR      Carriage Return         0d      classic Macintosh
  416.         LF      Line Feed                       0a      UNIX
  417.         CR/LF   Carriage Return/Line Feed   0d0a    MS-DOS, Windows, OS/2
  418.  
  419.     *)
  420.     -- for some crazy reason, I found hex "090a" (HT LF) in a html file.
  421.     set clipboardData to alterString(clipboardData, ht & lf, lf)
  422.     -- don't let Windoze confuse us. convert Return LineFeed to lf
  423.     set clipboardData to alterString(clipboardData, return & lf, lf)
    -- might as well convert classic Mac OS returns to lf too, so there are fewer cases to look for.
  425.     set clipboardData to alterString(clipboardData, return, lf)
  426.     if debug ≥ 2 then hexDumpFormatOne("change various line ends to a LF. clipboardData", clipboardData)
  427.    
  428.     -- figure out what type of data we have: plain text or html source code text.
  429.     set paraCount to count of textToList(clipboardData, "<p")
  430.     set endparaCount to count of textToList(clipboardData, "</p>")
  431.     set titleCount to count of textToList(clipboardData, "<title")
  432.     set endTitleCount to count of textToList(clipboardData, "</title>")
  433.     set aLinkCount to count of textToList(clipboardData, "href=\"http")
  434.     -- mangled href="http
  435.     set mangledLinkCount to count of textToList(clipboardData, "href=\\\"http")
  436.     set brCount to count of textToList(clipboardData, "<br>")
  437.     if debug ≥ 2 then
        log "Values used to distinguish HTML source code from plain text."
  439.         log "paraCount  is " & paraCount
  440.         log "endparaCount is " & endparaCount
  441.         log "titleCount is " & titleCount
  442.         log "endTitleCount is " & endTitleCount
  443.         log "aLinkCount is " & aLinkCount
  444.         log "brCount is " & brCount
  445.         log "mangledLinkCount is " & mangledLinkCount
  446.     end if
  447.    
  448.     -- note, textToList returns a count one greater than the actual because item one is the data before the first found entry.
  449.     if paraCount ≥ 4 and endparaCount ≥ 3 or brCount ≥ 4 or ((titleCount is endTitleCount) and titleCount ≥ 2) or aLinkCount ≥ 3 or mangledLinkCount ≥ 3 then
        -- ASC tends to convert line ends to either <p></p> or <p><br></p>, which isn't desirable for HTML input.
  451.         if debug ≥ 2 then log "... found HTML input ... (in plain text format )."
  452.         set clipboardData to adjustBrowserHTML(clipboardData)
  453.        
  454.     else
  455.         if debug ≥ 2 then log "... found plain Text input ..."
  456.         set clipboardData to typeText(clipboardData)
  457.     end if
  458.     set readyData to convertToHTML(clipboardData)
  459.     if debug ≥ 4 then log "bye from  -.- common -.-"
  460.     return readyData
  461. end common
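
-- A minimal sketch, not part of the original script: common() relies on alterString and
-- textToList, whose definitions are not shown in this part of the file.  The hypothetical
-- handlers below show one common way to get a similar effect with text item delimiters;
-- they are assumptions, not the author's code.
on exampleReplaceText(theText, searchString, replacementString)
    set savedDelimiters to AppleScript's text item delimiters
    -- split on the search string, then rejoin with the replacement
    set AppleScript's text item delimiters to searchString
    set thePieces to text items of theText
    set AppleScript's text item delimiters to replacementString
    set newText to thePieces as text
    set AppleScript's text item delimiters to savedDelimiters
    return newText
end exampleReplaceText

-- Counts occurrences of token.  Splitting yields one more piece than there are tokens,
-- which is why common() compares its counts against values one higher than the real count.
on exampleCountOccurrences(theText, token)
    set savedDelimiters to AppleScript's text item delimiters
    set AppleScript's text item delimiters to token
    set pieceCount to count of text items of theText
    set AppleScript's text item delimiters to savedDelimiters
    return pieceCount - 1
end exampleCountOccurrences
-- Usage: exampleReplaceText("a" & return & (character id 10) & "b", return & (character id 10), character id 10)
-- turns the CR LF pair into a single LF; exampleCountOccurrences("<p>x</p><p>y</p>", "<p") returns 2.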
  462.  
  463. -- ------------------------------------------------------  
  464. (* add paragraphs *)
  465. on addParagraphs(theOutputBuffer)
  466.     global debug
  467.     if debug ≥ 3 then log "in --- addParagraphs ---"
  468.     set lf to character id 10
  469.    
  470.     -- start the theOutputBuffer with a paragraph tag.  We are taking a simple approach at this time.
  471.     set theOutputBuffer to "<p>" & theOutputBuffer
    -- LF
    -- Remember that CRLF and CR were both changed to LF above,
    -- so we only need to look for LF here.
  475.     set theOutputBuffer to alterString(theOutputBuffer, lf & lf, "</p><p> </p><p>")
  476.    
  477.     -- Does the string end with a dangling paragraph?  
  478.     if debug ≥ 5 then
  479.         log "length of theOutputBuffer is " & length of theOutputBuffer
  480.     end if
    if (length of theOutputBuffer) > (length of "</p>") then
        if theOutputBuffer ends with "<p>" then
            -- drop the dangling opening tag
            set theOutputBuffer to text 1 thru ((length of theOutputBuffer) - 3) of theOutputBuffer
        else if theOutputBuffer does not end with "</p>" then
            -- close the last paragraph (the old three-character test never matched "</p>")
            set theOutputBuffer to theOutputBuffer & "</p>"
        end if
    end if
  488.     if debug ≥ 4 then log "bye from  -.- addParagraphs -.-"
  489.     return theOutputBuffer
  490. end addParagraphs
  491.  
  492. -- ------------------------------------------------------
  493. (*
  494.   We received HTML class data on the clipboard.  This is the manager.
  495.   At this point, we expect only LFs in the text.
  496.  *)
  497. on adjustBrowserHTML(normalHtml)
  498.     global debug
  499.     if debug ≥ 3 then log "in --- adjustBrowserHTML ---"
  500.     set lf to character id 10
  501.    
  502.     set alteredHTML to adjustURLs(normalHtml)
  503.     set alteredHTML to adjustToAscHTML(alteredHTML)
  504.     if debug ≥ 4 then log "bye from  -.- adjustBrowserHTML -.-"
  505.     return alteredHTML
  506. end adjustBrowserHTML
  507.  
  508. -- ------------------------------------------------------
  509. on adjustLF(theBuffer)
  510.     global debug
  511.     set lf to character id 10
  512.     if debug ≥ 3 then log "in --- adjustLF() ---"
  513.     if debug ≥ 2 then hexDumpFormatOne("input from theBuffer", theBuffer)
    set numberOfLf to 1 -- for debugging, so we can display the loop count
  515.    
  516.     set inputLfBuffer to theBuffer -- now, input data
  517.     set outputBuildLf to "" -- output data
  518.     -- copy & change
  519.     -- ditch leading LFs
  520.     repeat while length of inputLfBuffer ≥ 2 and text 1 thru 1 of inputLfBuffer is lf
  521.         if debug ≥ 2 then log "found leading lf. current length of inputLfBuffer is " & getIntegerAndHex(length of inputLfBuffer)
        -- just lop off the LF
        set inputLfBuffer to text 2 thru -1 of inputLfBuffer

        if debug ≥ 2 then log "next text character is " & text 1 thru 1 of inputLfBuffer
  526.     end repeat
  527.     (*
  528.     -- not sure that this is helpful.
  529.     -- preserve spacing
  530.     if debug ≥ 2 then log "converting 4, 3 and 2 LFs to <p>'s."
  531.     set inputLfBuffer to alterString(inputLfBuffer, lf & lf & lf & lf, "<p><br></p><p>~3~</p><p><br></p>")
  532.     set inputLfBuffer to alterString(inputLfBuffer, lf & lf & lf, "<p><br></p><p>~2~</p>")
  533.     set inputLfBuffer to alterString(inputLfBuffer, lf & lf, "<p><br></p><p>~1~</p>")
  534.     if debug ≥ 2 then hexDumpFormatOne("after ditching and changing multiple LFs inputLfBuffer", inputLfBuffer)
  535.     *)
  536.    
  537.     repeat until inputLfBuffer is ""
  538.        
  539.         set whereLfOffset to offset of lf in inputLfBuffer
  540.         if debug ≥ 2 then log "whereLfOffset is " & whereLfOffset & " in hex " & integerToHex(whereLfOffset)
  541.        
  542.         -- get before and after characters if present.
  543.         if whereLfOffset ≥ 2 then
  544.             set priorCharacter to (text (whereLfOffset - 1) thru (whereLfOffset - 1) in inputLfBuffer)
  545.         else
  546.             set priorCharacter to ""
  547.         end if
  548.         if whereLfOffset ≥ (length of inputLfBuffer) then
  549.             -- no following character
  550.             set followingCharacter to ""
  551.         else
  552.             set followingCharacter to (text (whereLfOffset + 1) thru (whereLfOffset + 1) in inputLfBuffer)
  553.         end if
  554.         if debug ≥ 2 then log "priorCharacter is >" & priorCharacter & "< followingCharacter is >" & followingCharacter & "<"
  555.        
  556.         -- process the LF. 
        if (whereLfOffset is 1) and ((length of inputLfBuffer) ≥ 2) then
  558.             set inputLfBuffer to text 2 thru -1 of inputLfBuffer
  559.             if debug ≥ 2 then log "leading lf.  Got rid of it."
  560.             -- nothing to move to outputBuildLf            
  561.         else if (whereLfOffset is 1) and ((length of inputLfBuffer) is 1) then
            -- we have found all the LFs there are to find.
  563.             set inputLfBuffer to ""
  564.             if debug ≥ 2 then log "only one character left.  Got rid of it."
  565.         else if followingCharacter is "" then
  566.             if debug ≥ 2 then log "null"
  567.             -- didn't we just check this? Yes, but we need to iterate somehow.
  568.             set {inputLfBuffer, outputBuildLf} to trimLf(inputLfBuffer, outputBuildLf, whereLfOffset, " ")
  569.             -- just skip it, so we don't have to put anything on outputBuildLf
  570.         else if priorCharacter is ">" then
  571.             --  LF after HTML tag. no real need for lf here.  asc tends to make these into <p></p>
  572.             if debug ≥ 2 then log "found a tag"
  573.             -- copy prior stuff
  574.             set {inputLfBuffer, outputBuildLf} to trimLf(inputLfBuffer, outputBuildLf, whereLfOffset, "")
  575.         else if followingCharacter is lf then
  576.             -- prevent double LFs.
  577.             if debug ≥ 2 then log "prevent double LFs at" & getIntegerAndHex(whereLfOffset)
  578.             set {inputLfBuffer, outputBuildLf} to trimLf(inputLfBuffer, outputBuildLf, whereLfOffset, "")
  579.             -- middle of text
  580.         else if (whereLfOffset < (length of inputLfBuffer)) and followingCharacter is " " then
  581.             -- we need to avoid double blanks
  582.             -- purge
  583.             if debug ≥ 2 then log "getting rid of lf at " & getIntegerAndHex(whereLfOffset)
  584.             -- skip lf.
  585.             set {inputLfBuffer, outputBuildLf} to trimLf(inputLfBuffer, outputBuildLf, whereLfOffset, "")
  586.         else
            -- assume there are characters before and after the LF.
  588.             if debug ≥ 2 then log "punt."
  589.             -- replace with blank
  590.             set {inputLfBuffer, outputBuildLf} to trimLf(inputLfBuffer, outputBuildLf, whereLfOffset, " ")
  591.         end if
  592.        
  593.         if debug ≥ 2 then
  594.             hexDumpFormatOne("outputBuildLf of " & numberOfLf, outputBuildLf)
  595.             hexDumpFormatOne("inputLfBuffer of " & numberOfLf, inputLfBuffer)
  596.         end if
  597.         -- next pass will be
  598.         set numberOfLf to numberOfLf + 1
  599.     end repeat
  600.     if debug ≥ 4 then log "bye from  -.- adjustLF() -.-"
  601.     return outputBuildLf
  602. end adjustLF
  603.  
  604. -- ------------------------------------------------------
  605. (* ASC likes to insert lots of white space into a page.
  This routine attempts to fix up the HTML to avoid
  all the extra white space.
  608.  
  609.    Minimize the amount of white space inserted.
  610.  *)
  611.  
  612. on adjustToAscHTML(ascHtml)
  613.     global debug
  614.     if debug ≥ 3 then log "in --- adjustToAscHtml ---"
  615.     set lf to character id 10
  616.     set numberOfPres to 1
  617.     -- In the context of HTML, LF should mostly be insignificant.
  618.     -- It would be bad to change an LF inside a <pre> tag,
  619.     -- so skip changing LFs inside <pre>...</pre>.
  620.     set buildHtml to "" -- will contain the output
  621.     if debug ≥ 2 then log "find <pre>s"
  622.     -- copy & change
  623.     if (offset of "</pre>" in ascHtml) is not 0 then
  624.         repeat while (offset of "</pre>" in ascHtml) is not 0
  625.             -- get text before "<pre" tag
  626.             set splitString to item 1 of splitTextToList(ascHtml, "<pre")
  627.             if debug ≥ 2 then
  628.                 log "splitString is " & splitString
  629.                 hexDumpFormatOne("buildHtml *before* adjustLF()", buildHtml)
  630.             end if
  631.             set buildHtml to buildHtml & adjustLF(splitString)
  632.             if debug ≥ 2 then hexDumpFormatOne("buildHtml after adjustLF()", buildHtml)
  633.            
  634.             -- lop off the header text we processed
  635.             -- while we found the text before "<pre", we still need to get it out
  636.             -- of ascHtml
  637.             -- chompLeftAndTag also gets rid of the token ("<pre"), so put it back
  638.             set ascHtml to "<pre" & chompLeftAndTag(ascHtml, "<pre")
  639.            
  640.             -- any more <pre> tags?
  641.             if ascHtml is "" then
  642.                 display dialog "HTML missing </pre> tag. possible logic error." giving up after 10
  643.                 -- none. We have already adjusted buildHtml
  644.                 exit repeat -- ------ done processing ascHtml ------>
  645.             end if
  646.             if debug ≥ 2 then hexDumpFormatOne("remaining ascHtml is ", ascHtml)
  647.            
  648.             -- tack on the unaltered <pre>..</pre> stuff
  649.             set buildHtml to buildHtml & (item 1 of splitTextToList(ascHtml, "</pre>")) & "</pre>"
  650.             if debug ≥ 2 then hexDumpFormatOne("buildHtml after finding </pre>", buildHtml)
  651.            
  652.             set ascHtml to chompLeftAndTag(ascHtml, "</pre>")
  653.             if debug ≥ 2 then hexDumpFormatOne("ascHtml end of " & numberOfPres & " pass", ascHtml)
  654.             set numberOfPres to numberOfPres + 1
  655.            
  656.         end repeat
  657.         -- remainder
  658.         set buildHtml to buildHtml & adjustLF(ascHtml)
  659.         set ascHtml to ""
  660.     else
  661.         -- LFs are only significant in <pre>...</pre>
  662.         if debug ≥ 2 then log "didn't find a <pre>"
  663.         -- all others are white space.
  664.         set buildHtml to adjustLF(ascHtml)
  665.         set ascHtml to "" -- input text processed
  666.     end if
  667.    
  668.    
  669.     (*
  670.     Hack about to fix ASC interpretation of HTML.
  671.    
  672.     ASC alters the definition of a paragraph to have no space before or after the paragraph.
  673.     A paragraph like <p></p> works like a <br>.
  674.    
  675.     Consequently, ASC converts <p> </p> to <p><br></p>, that is a
  676.     space only paragraph to a paragraph with a <br> in it.
  677.    
  678.     the code converts one tag on a line to a line of tags.
  679.     </ol>
  680.    </p>
  681.    <p>
  682.     converted form
  683.     </ol></p><p>
  684.    
  685.     so that means a change on </ol></p><p> converts both the multi-lines form and the single line form.
  686.    
  687.     *)
  688.     set buildHtml to alterString(buildHtml, "<br> ", "<br>")
  689.     -------------------- failure???
  690.     --set buildHtml to alterString(buildHtml, "<p> ", "<p>")
  691.    
  692.     -- asc paragraphs don't generate space before and after the paragraph.
  693.     set buildHtml to alterString(buildHtml, "</p><p></p><p></p>", "</p><p> </p><p></p>")
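  694.     -- same replacement again: one pass cannot match overlapping occurrences, so longer runs of empty paragraphs need a second pass.
  695.     set buildHtml to alterString(buildHtml, "</p><p></p><p></p>", "</p><p> </p><p></p>")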
  696.    
  697.     set buildHtml to alterString(buildHtml, "</ol></p><p>", "</ol><p> </p></p><p>")
  712.     --set buildHtml to alterString(buildHtml, "<p> </p>", "<p></p>")
  713.     if debug ≥ 2 then hexDumpFormatOne("complete buildHtml ", buildHtml)
  714.     if debug ≥ 4 then log "bye from  -.- adjustToAscHTML -.-"
  715.     return buildHtml
  716. end adjustToAscHTML
  717.  
  718. -- ------------------------------------------------------
  719. (*
  720. example:
  721.   Free version of Parallels for individual use:</p><p><br></p>
  722.   <p>https://itunes.apple.com/us/app/parallels-desktop-lite/id1085114709?mt=12</p>
  723.   <p><br></p>
  724.   <p>Full version</p><p><a href="http://www.parallels.com/en/products/desktop/" target="_blank">
  725.      http://www.parallels.com/en/products/desktop/</a>
  726.      
  727. If ASC finds a URL outside of an <a> tag, it will place blank lines around the URL. No, it will not go the
  728. full nine yards and place an <a> tag around the URL.
  729.  
  730. *)
  731. on adjustURLs(theOriginalInputBuffer)
  732.     global debug
  733.     if debug ≥ 3 then log "in --- adjustURLs ---"
  734.     set alteredBuffer to false
  735.     set lf to character id 10
  736.     set theInputBuffer to theOriginalInputBuffer
  737.     if debug ≥ 2 then hexDumpFormatOne("theInputBuffer", theInputBuffer)
  738.    
  739.     -- we end up in a lot of grief when the buffer ends without
  740.     -- a line-end
  741.     if text (length of theInputBuffer) thru (length of theInputBuffer) of theInputBuffer is not lf then
  742.         -- tack LF at the end
  743.         set alteredBuffer to true
  744.         set theInputBuffer to theInputBuffer & lf
  745.         if debug ≥ 2 then hexDumpFormatOne("theInputBuffer", theInputBuffer)
  746.     end if
  747.    
  748.     set buildHtml to ""
  749.     if debug ≥ 5 then log "buildHTML [ should be empty string ] is " & buildHtml
  750.     set countI to 1 -- variable is used for debugging.
  751.     -- do until we have processed theInputBuffer
  752.     repeat until theInputBuffer is ""
  753.         if debug ≥ 2 then log "at the top of theInputBuffer ........."
  754.        
  755.         set foundWhere to {}
  756.         repeat with lookCharacters in {"https://", "http://", "<a "}
  757.             copy (offset of lookCharacters in theInputBuffer) to the end of the foundWhere
  758.             try
  759.                 set tempLoc to (offset of lookCharacters in theInputBuffer)
  760.                 if debug ≥ 2 then log "searching for " & lookCharacters & " found at offset  " & tempLoc & " contains " & text tempLoc thru (tempLoc + ((length of lookCharacters) - 1)) of theInputBuffer
  761.             end try
  762.         end repeat
  763.         if debug ≥ 2 then log foundWhere
  764.         set foundMarkerOffset to (minimumPositiveNumber from foundWhere)
  765.         -- figure out what type of marker we got?
  766.        
  767.         -- None.  Reached the end of the data without finding one.
  768.         if foundMarkerOffset ≤ 0 then
  769.             -- we are done
  770.             if debug ≥ 2 then log "Found all links."
  771.             set buildHtml to buildHtml & theInputBuffer
  772.             if debug ≥ 2 then printHeader("buildHTML", buildHtml)
  773.             set theInputBuffer to ""
  774.             exit repeat -- ------ done processing theInputBuffer ------>
  775.         end if
  776.        
  777.         -- find which of three markers we found.
  778.         if (text foundMarkerOffset thru (foundMarkerOffset + 2) of theInputBuffer) is "<a " then
  779.             set actualMarker to "<a "
  780.         else if text foundMarkerOffset thru (foundMarkerOffset + 6) of theInputBuffer is "http://" then
  781.             set actualMarker to "http://"
  782.         else
  783.             -- just assume it's the remaining "https://" since we looked for just three.
  784.             set actualMarker to "https://"
  785.         end if
  786.         set actualMarkerOffsetLength to ((length of actualMarker) - 1)
  787.         if debug ≥ 2 then
  788.             log "actualMarker is " & actualMarker & " actualMarkerOffsetLength is " & actualMarkerOffsetLength
  789.             log "foundMarkerOffset is " & getIntegerAndHex(foundMarkerOffset) & "  verify marker text is " & text foundMarkerOffset thru (foundMarkerOffset + actualMarkerOffsetLength) of theInputBuffer
  790.         end if
  791.        
  792.        
  793.         if foundMarkerOffset ≥ 2 then
  794.             -- collect and strip off characters that are before the marker.
  795.             if debug ≥ 2 then
  796.                 log "buildHTML is " & buildHtml & " length is " & getIntegerAndHex(length of buildHtml)
  797.                 hexDumpFormatOne("theInputBuffer", theInputBuffer)
  798.                 log " (foundMarkerOffset - 1) is " & getIntegerAndHex((foundMarkerOffset - 1))
  799.             end if
  800.             -- get the preceding text
  801.             set buildHtml to buildHtml & text 1 thru (foundMarkerOffset - 1) of theInputBuffer
  802.             if debug ≥ 2 then
  803.                 log "buildHTML is " & buildHtml
  804.                 hexDumpFormatOne("buildHTML", buildHtml)
  805.             end if
  806.            
  807.             -- https://apple.stackexchange.com/a/20135/44531
  808.            
  809.             set theInputBuffer to text foundMarkerOffset thru -1 of theInputBuffer -- trim off the characters before what we found
  810.             if debug ≥ 2 then
  811.                 printHeader("theInputBuffer", theInputBuffer)
  812.                 hexDumpFormatOne("theInputBuffer", theInputBuffer)
  813.             end if
  814.         else
  815.             if debug ≥ 2 then log "==> no preceding data."
  816.         end if
  817.        
  818.         repeat 1 times -- single-pass loop, so that "exit repeat" skips ahead to the next link
  819.            
  820.             -- example: the URL is also the display text
  821.             -- <a href="https://discussions.apple.com/docs/DOC-8841" target="_blank">https://discussions.apple.com/docs/DOC-8841</a>
  822.             if debug ≥ 2 then hexDumpFormatOne("theInputBuffer", theInputBuffer)
  823.            
  824.             -- check for the <a> tag
  825.             if text 1 thru (length of "<a ") of theInputBuffer is "<a " then
  826.                 -- found <a> tag
  827.                 if debug ≥ 2 then log "processing <a> tag"
  828.                 -- ASC considers a line-end to be a <br>, whereas Firefox considers it a blank
  829.                 -- change a possible line-end before an <a> tag to a " "
  830.                 if debug ≥ 2 then hexDumpFormatOne("before lf check buildHTML", buildHtml)
  831.                 if text (length of buildHtml) thru (length of buildHtml) of buildHtml is lf then
  832.                     if debug ≥ 2 then log "we need to delete a line-end before the <a> tag"
  833.                     set buildHtml to text 1 thru ((length of buildHtml) - 1) of buildHtml
  834.                     set buildHtml to buildHtml & " "
  835.                     if debug ≥ 2 then hexDumpFormatOne("after lf deletion buildHTML", buildHtml)
  836.                 end if
  837.                 -- find ending </a> tag
  838.                 set whereEnds to offset of "</a>" in theInputBuffer
  839.                 if whereEnds ≤ 0 then
  840.                     if debug ≥ 2 then log "==> found an error in the HTML.  no ending </a>"
  841.                     set buildHtml to buildHtml & theInputBuffer
  842.                     if debug ≥ 2 then printHeader("buildHTML", buildHtml)
  843.                     set theInputBuffer to ""
  844.                     display dialog "Found an error in the HTML.  No ending </a>.  Will skip." giving up after 10
  845.                     exit repeat -- ------ next ------>
  846.                 end if
  847.                 set lastOffsetLength to ((length of "</a>") - 1)
  848.                 if debug ≥ 2 then log "lastOffsetLength is " & lastOffsetLength
  849.                 set lastCharacterOffset to whereEnds + lastOffsetLength
  850.                 if debug ≥ 2 then log "lastCharacterOffset is " & lastCharacterOffset
  851.                 -- needs to copy the ending ">"
  852.                 set anchorString to text 1 thru lastCharacterOffset of theInputBuffer
  853.                 -- don't let Windoze confuse us. CRLF was already converted to LF, so only LF needs handling here.
  854.                 -- Work around an absurd ASC bug where there is a line-end in the <a> text.
  855.                 if debug ≥ 2 then hexDumpFormatOne("before adjusting anchorString", anchorString)
  856.                 set anchorString to alterString(anchorString, lf, " ")
  857.                 if debug ≥ 2 then hexDumpFormatOne("anchorString", anchorString)
  858.                 set buildHtml to buildHtml & anchorString
  859.                 if debug ≥ 2 then hexDumpFormatOne("buildHTML", buildHtml)
  860.                 -- https://apple.stackexchange.com/a/20135/44531
  861.                 -- We want first character of the "next" portion of theInputBuffer so add one
  862.                 set theInputBuffer to text (lastCharacterOffset + 1) thru -1 of theInputBuffer --trim out <a>
  863.                 if debug ≥ 2 then hexDumpFormatOne("theInputBuffer", theInputBuffer)
  864.                 -- Web Browsers like Firefox convert a line-end in text to a space.
  865.                 if text 1 thru 1 of theInputBuffer is lf then
  866.                     if (length of theInputBuffer) is 1 then
  867.                         set theInputBuffer to " "
  868.                     else
  869.                         set theInputBuffer to " " & (text 2 thru (length of theInputBuffer) of theInputBuffer)
  870.                         if debug ≥ 2 then hexDumpFormatOne("after lf deletion; theInputBuffer", theInputBuffer)
  871.                     end if
  872.                 end if
  873.                 exit repeat -- ------ next ------>
  874.             end if
  875.            
  876.             -- find the end of the URL by splitting on blank or line-end
  877.             -- unsafe characters  <blank> " < > # % { } | \ ^ ~ [ ] `
  878.             -- and line-end
  879.             -- while # is listed as unsafe, it does appear in a URL as the fragment marker,
  880.             -- so leave it out as an ending character.
  881.             -- https://perishablepress.com/stop-using-unsafe-characters-in-urls/
  882.             -- the clipboard string may end right after the URL, hence no " ", LF or CR
  883.             -- Remember, CRLF was converted to LF above
  884.             set endsWhere to {}
  885.             -- the URL ends at the first of the not-allowed characters or at a line-end
  886.             repeat with unsafeCharacter in {" ", "\"", lf, "<", ">", "%", "{", "}", "|", "\\", "^", "~", "[", "]"}
  887.                 copy (offset of unsafeCharacter in theInputBuffer) to the end of the endsWhere
  888.             end repeat
  889.             if debug ≥ 2 then log endsWhere
  890.             set endOfURL to (minimumPositiveNumber from endsWhere) - 1
  891.            
  892.             if debug ≥ 2 then log "endOfURL is " & endOfURL
  893.            
  894.             if endOfURL ≤ 0 then
  895.                 -- We have reached the end of the input
  896.                 set theURL to theInputBuffer
  897.                 set theInputBuffer to ""
  898.             else
  899.                 set theURL to text 1 thru endOfURL of theInputBuffer
  900.                 if debug ≥ 2 then log "from middle theURL is " & theURL
  901.                
  902.                 set theInputBuffer to text (endOfURL + 1) thru -1 of theInputBuffer -- trim off url in front.
  903.             end if
  904.             if debug ≥ 2 then printHeader("theInputBuffer", theInputBuffer)
  905.             if debug ≥ 1 then log "----------------------- " & theURL & " -----------------------"
  906.             (*
  907.             retrieve the file pointed to by the URL so we can
  908.             get the title. Note: <title> can have attributes.  Example:
  909.                
  910.             <title data-test-page-title="Parallels Desktop Lite on the Mac App Store"
  911.             >‎Parallels Desktop Lite on the Mac App Store</title>
  912.            
  913.             *)
  914.            
  915.             -- Example:
  916.             -- curl --silent --location --max-time 10 <URL>
  917.             set toUnix to "curl --silent --location --max-time 10 " & quoted form of theURL
  918.             if debug ≥ 2 then log "what we will use to retrieve the Url. toUnix  is " & return & "  " & toUnix
  919.             try
  920.                 if debug ≥ 2 then log "reading link file to get title"
  921.                 set fromUnix to do shell script toUnix
  922.                 --log "fromUnix:"
  923.                 if debug ≥ 2 then
  924.                     printHeader("fromUnix", fromUnix)
  925.                     -- may not be working with an HTLM document, so thefound title may be too long or confused.
  926.                     log "how far?..."
  927.                 end if
  928.                 -- there could be some bagage with the <title
  929.                 set actualTagData to tagContent(fromUnix, "<title", "</title>")
  930.                 -- Find what we will actually display in the title.
  931.                 -- Fix up gotchas.             
  932.                 if debug ≥ 2 then printHeader("actualTagData", actualTagData)
  933.                 if actualTagData is "" then
  934.                     set actualTagData to theURL
  935.                 else if length of actualTagData > 140 then
  936.                     if debug ≥ 2 then log "length of actualTagData is " & length of actualTagData & " which is too long.  Using the URL instead."
  937.                     set actualTagData to theURL
  938.                     -- curl https://appleid.apple.com returns <title>403 Forbidden</title>
  939.                     -- which is misleading.
  940.                 else if actualTagData contains "403" and actualTagData contains "Forbidden" then
  941.                     set actualTagData to theURL
  942.                 else
  943.                     -- there could be some attributes within the <title> tag.
  944.                     -- or there could not be
  945.                     -- an attribute could have a > in it. ignoring that for now.
  946.                     try
  947.                         -- find where <title ends
  948.                         set whereToEnd to (offset of ">" in actualTagData)
  949.                         if debug ≥ 2 then log "whereToEnd is " & whereToEnd
  950.                         set whereToBegin to whereToEnd + (length of ">")
  951.                         if debug ≥ 2 then log "whereToBegin is " & whereToBegin
  952.                         if debug ≥ 2 then hexDumpFormatOne("actualTagData", actualTagData)
  953.                         set actualTagData to text whereToBegin thru (length of actualTagData) of actualTagData
  954.                         if debug ≥ 2 then log "actualTagData is " & actualTagData
  955.                     on error theErrorMessage number theErrorNumber
  956.                         log "==> No ending greater than (>) for title. Badly constructed HTML." & return & "message is " & theErrorMessage & " error number " & theErrorNumber
  957.                         set actualTagData to actualTagData
  958.                         -- no need to repair.  It's not our page.
  959.                     end try
  960.                    
  961.                     -- found a line-end in a title, which caused confusion.
  962.                     -- note: this is new data and the multiple line-ends have not been
  963.                     -- filtered out.
  964.                     -- some joker had a line-end in the title!
  965.                     set actualTagData to alterString(actualTagData, return & lf, "  ")
  966.                     set actualTagData to alterString(actualTagData, return, " ")
  967.                     set actualTagData to alterString(actualTagData, lf, "  ")
  968.                     if debug ≥ 2 then log "actualTagData has been changed to " & actualTagData
  969.                     if debug ≥ 2 then hexDumpFormatOne("actualTagData", actualTagData)
  970.                 end if
  971.             on error errMsg number n
  972.                 if debug ≥ 2 then log "==> Error occurred when looking for title. " & errMsg & " with number " & n
  973.                 set actualTagData to theURL
  974.             end try
  975.             -- why the _blank in the <a>?
  976.             set assembled to "<a href=\"" & theURL & "\" target=\"_blank\">" & actualTagData & "</a>"
  977.             if debug ≥ 2 then log "assembled  is " & assembled
  978.            
  979.             if (length of theInputBuffer) ≤ 0 then
  980.                 -- We have reached the end of the input
  981.                 if debug ≥ 2 then log "we have reached the end of the input."
  982.                 set buildHtml to buildHtml & assembled
  983.             else
  984.                 if debug ≥ 2 then log "more input to process"
  985.                 set buildHtml to buildHtml & assembled
  986.             end if
  987.            
  988.             -- wrap up
  989.             --log "transformed text from buildHTML is  " & return & buildHTML
  990.             if debug ≥ 2 then log "#" & countI & " transformed text from buildHTML is  " & return & buildHtml
  991.             -- number of links found
  992.             set countI to countI + 1
  993.            
  994.         end repeat -- used to iterate
  995.     end repeat -- processing links in the input text
  996.     if alteredBuffer is true then
  997.         -- chop off the lf we added above.
  998.         set buildHtml to text 1 thru ((length of buildHtml) - 1) of buildHtml
  999.         set alteredBuffer to false -- somewhat redundant
  1000.     end if
  1001.     if debug ≥ 4 then log "bye from  -.- adjustURLs -.-"
  1002.     return the buildHtml
  1003.    
  1004. end adjustURLs
  1005.  
  1006. -- ------------------------------------------------------
  1007. (*
  1008. alterString
  1009.  thisText is the input string to change
  1010.  delim is what string to change.  It doesn't have to be a single character.
  1011.   replacement is the new string
  1012.  
  1013.   returns the changed string.
  1014. *)
  1015.  
  1016. on alterString(thisText, delim, replacement)
  1017.     global debug
  1018.     if debug ≥ 5 then log "in ~~~ alterString ~~~"
  1019.     set resultString to thisText -- fall back to the unchanged input if the replacement fails
  1020.     set {tid, my text item delimiters} to {my text item delimiters, delim}
  1021.     try
  1022.         set resultList to every text item of thisText
  1023.         set my text item delimiters to replacement
  1024.         set resultString to resultList as string
  1025.         set my text item delimiters to tid
  1026.     on error
  1027.         set my text item delimiters to tid
  1028.     end try
  1029.     return resultString
  1030. end alterString
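        (*
        A minimal usage sketch for alterString. The input values are made up
        for illustration and do not come from the script itself:

            alterString("a,b,c", ",", "; ")           --> "a; b; c"
            alterString("<br> next", "<br> ", "<br>")  --> "<br>next"

        delim is used as a text item delimiter, so the input is split once,
        left to right, and the output is not rescanned for new matches.
        *)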
  1031.  
  1032. -- ------------------------------------------------------
  1033. (*
  1034.   Split theString at theToken. Returns {text from theToken onward, text before theToken}, or {theString, ""} when theToken is not found.
  1035. *)
  1036. on answerAndChomp(theString, theToken)
  1037.     global debug
  1038.     if debug ≥ 5 then log "in ~~~ answerAndChomp ~~~"
  1039.     set debugging to false
  1040.     set theOffset to offset of theToken in theString
  1041.     if debug ≥ 6 then log "theOffset is " & theOffset
  1042.     set theLength to length of theString
  1043.     if theOffset > 0 then
  1044.         set beginningPart to text 1 thru (theOffset - 1) of theString
  1045.         if debug ≥ 6 then log "beginningPart is " & beginningPart
  1046.        
  1047.         set chompped to text theOffset thru theLength of theString
  1048.         if debug ≥ 6 then log "chompped is " & chompped
  1049.         return {chompped, beginningPart}
  1050.     else
  1051.         set beginningPart to ""
  1052.         return {theString, beginningPart}
  1053.     end if
  1054.    
  1055. end answerAndChomp
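        (*
        A minimal usage sketch for answerAndChomp, using made-up input.
        Note that the first item of the result still starts with theToken:

            set {rest, front} to answerAndChomp("key=value", "=")
            -- rest is "=value", front is "key"

            set {rest, front} to answerAndChomp("key=value", "#")
            -- token not found: rest is "key=value", front is ""
        *)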
  1056.  
  1057. -- ------------------------------------------------------
  1058. (*
  1059.   Delete the leading part of the string until and including theToken.
  1060. *)
  1061. on chompLeftAndTag(theString, theToken)
  1062.     global debug
  1063.     if debug ≥ 6 then log "in --- chompLeftAndTag ---"
  1064.     if debug ≥ 5 then
  1065.         log "theToken is " & theToken
  1066.         hexDumpFormatOne("theString", theString)
  1067.     end if
  1068.     set theOffset to offset of theToken in theString
  1069.     if debug ≥ 5 then log "theOffset is " & theOffset & " in hex is " & integerToHex(theOffset)
  1070.     set theLength to length of theString
  1071.     if debug ≥ 5 then log "theLength is " & theLength & " in hex is " & integerToHex(theLength)
  1072.    
  1073.     if theOffset > 0 then
  1074.         -- Do we have any more of the string to return?
  1075.         if (theOffset + (length of theToken)) ≤ length of theString then
  1076.             set chompped to text (theOffset + (length of theToken)) thru theLength of theString
  1077.         else
  1078.             set chompped to ""
  1079.         end if
  1080.         if debug ≥ 5 then log "length of chompped is " & length of chompped & " chompped is " & chompped
  1081.         return chompped
  1082.     else
  1083.         return ""
  1084.     end if
  1085. end chompLeftAndTag
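        (*
        A minimal sketch of what chompLeftAndTag is meant to return, using
        made-up input. Everything up to and including the first theToken is
        dropped:

            chompLeftAndTag("<p>one</p><p>two</p>", "</p>")  --> "<p>two</p>"
            chompLeftAndTag("text</pre>", "</pre>")           --> ""  (nothing follows the token)
            chompLeftAndTag("no token here", "</pre>")        --> ""  (token not found)

        The last two cases both give "", so a caller that needs to tell them
        apart has to check for the token itself first.
        *)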
  1086.  
  1087. -- ------------------------------------------------------
  1088. on convertToHTML(theData)
  1089.     global debug
  1090.     if debug ≥ 3 then log "in --- convertToHTML ---" & return & "  Try to send back HTML. the processed data in variable theData is " & theData
  1091.     try
  1092.         set clipboardDataQuoted to quoted form of theData
  1093.        
  1094.         if debug ≥ 1 then
  1095.             log "------ data soon to be returned -------" & return & "clipboardDataQuoted is " & return & clipboardDataQuoted
  1096.             hexDumpFormatOne("clipboardDataQuoted", clipboardDataQuoted)
  1097.         end if
  1098.         -- make hex string as required for HTML data on the clipboard
  1099.         set toUnix to "/bin/echo -n " & clipboardDataQuoted & " | hexdump -ve '1/1 \"%.2x\"'"
  1100.         if debug ≥ 5 then printHeader("toUnix to convert to hex", toUnix)
  1101.         set fromUnix to do shell script toUnix
  1102.        
  1103.         if debug ≥ 5 then printHeader("fromUnix", fromUnix)
  1104.        
  1105.         if debug ≥ 5 then
  1106.             log "displaying original string --- so we can tell if it converted successfully. "
  1107.             --hexDumpFormatOne("fromUnix", fromUnix)
  1108.         end if
  1109.     on error errMsg number n
  1110.         log "==> convert to hex string failed. " & errMsg & " with number " & n
  1111.         set fromUnix to ""
  1112.     end try
  1113.     if debug ≥ 4 then log "bye from  -.- convertToHTML -.-"
  1114.     return fromUnix
  1115. end convertToHTML
  1116.  
  1117. -- ------------------------------------------------------  
  1118. (*
  1119. Yvan Koenig
  1120. https://macscripter.net/viewtopic.php?id=43133
  1121. *)
  1122. on findExtension(inputFileName)
  1123.     global debug
  1124.     if debug ≥ 5 then log "in ~~~ findExtension ~~~"
  1125.     set fileName to inputFileName as string
  1126.     set saveTID to AppleScript's text item delimiters
  1127.     set AppleScript's text item delimiters to {"."}
  1128.     set theExt to last text item of fileName
  1129.     set AppleScript's text item delimiters to saveTID
  1130.     --log "theExt is " & theExt
  1131.     if theExt ends with ":" then set theExt to text 1 thru -2 of theExt
  1132.     if debug ≥ 5 then log "theExt is " & theExt
  1133.     return theExt
  1134. end findExtension
  1135.  
  1136. -- ------------------------------------------------------
  1137. (*
  1138. Shorthand for log expressions such as: length of inputLfBuffer & " in hex " & integerToHex(length of inputLfBuffer)
  1139. *)
  1140. on getIntegerAndHex(aNumber)
  1141.     global debug
  1142.     if debug ≥ 5 then log "in ~~~ getIntegerAndHex ~~~"
  1143.    
  1144.     return (aNumber as text) & " in Hex " & integerToHex(aNumber) -- coerce first: integer & text would build a list, not text
  1145. end getIntegerAndHex
  1146.  
  1147. -- ------------------------------------------------------
  1148. (*
  1149.  http://krypted.com/mac-os-x/to-hex-and-back/
  1150.               0    2    4    6    8    a    c    e     0 2 4 6 8 a c e
  1151. 0000000:   3c 703e 5369 6d70 6c65 2070 7574 2c20   <p>Simple put,
  1152.            *)
  1153. on hexDumpFormatOne(textMessage, hex)
  1154.     global debug
  1155.     set aNul to character id 1
  1156.    
  1157.     if debug ≥ 5 then
  1158.         log "in ~~~ hexDumpFormatOne ~~~"
  1159.         log "input string is " & return & hex
  1160.     end if
  1161.     -- -r -p
  1162.     set displayValue to aNul & hex
  1163.     set toUnix to "/bin/echo -n " & (quoted form of displayValue) & " | xxd  "
  1164.     if debug ≥ 5 then log "toUnix is " & toUnix
  1165.     try
  1166.         set fromUnix to do shell script toUnix
  1167.        
  1168.         set displayText to replaceCharacter(fromUnix, 10, "  ")
  1169.         if debug ≥ 5 then
  1170.             log return & displayText
  1171.             log "length of displayText is " & length of displayText
  1172.         end if
  1173.         set displayText to replaceCharacter(displayText, 51, " ")
  1174.         if debug ≥ 5 then
  1175.             log return & displayText
  1176.             log "almost there ..... length of displayText is " & length of displayText
  1177.         end if
  1178.         log "variable " & textMessage & " in hex is " & return & "         0    2    4    6    8    a    c    e     0 2 4 6 8 a c e" & return & displayText
  1179.     on error errMsg number n
  1180.         log "==> convert hex string to string failed. " & errMsg & " with number " & n
  1181.     end try
  1182.     if debug ≥ 5 then
  1183.         log "leaving ~.~ hexDumpFormatOne ~.~"
  1184.     end if
  1185. end hexDumpFormatOne
  1186.  
  1187. -- ------------------------------------------------------
  1188. on hexDumpFormatZero(textMessage, hex)
  1189.     global debug
  1190.     if debug ≥ 5 then log "in ~~~ hexDumpFormatZero ~~~"
  1191.     if debug ≥ 5 then log "input string is " & hex
  1192.     -- -r -p
  1193.     set toUnix to "/bin/echo -n " & (quoted form of hex) & " | xxd  "
  1194.     if debug ≥ 5 then log "toUnix is " & toUnix
  1195.     try
  1196.         set displayText to do shell script toUnix
  1197.        
  1198.         log "variable " & textMessage & " in hex is " & return & "         0    2    4    6    8    a    c    e     0 2 4 6 8 a c e" & return & displayText
  1199.     on error errMsg number n
  1200.         log "==> convert hex string to string failed. " & errMsg & " with number " & n
  1201.     end try
  1202. end hexDumpFormatZero
  1203.  
  1204. -- ------------------------------------------------------
  1205. (*
  1206. https://macscripter.net/viewtopic.php?id=43713
  1207.  *)
  1208. on integerToHex(nDec)
  1209.     global debug
  1210.     if debug ≥ 5 then log "in ~~~ integerToHex ~~~"
  1211.     try
  1212.         set nHex to do shell script "perl -e 'printf(\"%X\", " & nDec & ")'" --> "F0"
  1213.     on error errMsg number n
        log "==> convert integer to hex failed. " & errMsg & " with number " & n
  1215.         set nHex to ""
  1216.     end try
  1217.     return nHex
  1218. end integerToHex
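(*
 Usage sketch, added for illustration and not part of the original script.
 It assumes the global debug has already been set, as the rest of the script
 does, and that perl is available to do shell script.

    integerToHex(240) --> "F0"
    integerToHex(10) --> "A"

 perl's %X prints uppercase digits with no zero padding, so a single digit
 comes back as one character.
*)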
  1219. -- ------------------------------------------------------
(*

https://stackoverflow.com/questions/55838252/minimum-value-that-not-zero
    set m to minimumPositiveNumber from {10, 2, 0, 2, 4}
    log "m is " & m --> m is 2
    set m to minimumPositiveNumber from {0, 0, 0}
    log "m is " & m --> m is 0

 Zeros are skipped unless every item is zero. Negative values are not filtered out.
*)
  1228. on minimumPositiveNumber from L
  1229.     global debug
  1230.     if debug ≥ 5 then log "in ~~~ minimumPositiveNumber ~~~"
  1231.     local L
  1232.    
  1233.     if L = {} then return null
  1234.    
  1235.     set |ξ| to 0
  1236.    
  1237.     repeat with x in L
  1238.         set x to x's contents
  1239.         if (x < |ξ| and x ≠ 0) ¬
  1240.             or |ξ| = 0 then ¬
  1241.             set |ξ| to x
  1242.     end repeat
  1243.    
  1244.     |ξ|
  1245. end minimumPositiveNumber
  1246.  
  1247. -- ------------------------------------------------------
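(*
 makeCaseUpper("Now is the time, perhaps, for all good men")
 --> "NOW IS THE TIME, PERHAPS, FOR ALL GOOD MEN"
 (assuming text item delimiters are at their default when the list is joined)
*)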
  1251. on makeCaseUpper(theString)
  1252.     global debug
  1253.     if debug ≥ 5 then log "in ~~~ makeCaseUpper ~~~"
  1254.     set UC to "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  1255.     set LC to "abcdefghijklmnopqrstuvwxyz"
  1256.     set C to characters of theString
  1257.     repeat with ch in C
  1258.         if ch is in LC then set contents of ch to item (offset of ch in LC) of UC
  1259.     end repeat
  1260.     return C as string
  1261. end makeCaseUpper
  1262.  
  1263. -- ------------------------------------------------------
  1264. on postToCLipboard(pleasePost)
  1265.     global debug
  1266.     if debug ≥ 5 then log "in ~~~ postToCLipboard ~~~"
  1267.     try
  1268.         -- osascript -e "set the clipboard to «data HTML${hex}»"     
  1269.         set toUnixSet to "osascript -e \"set the clipboard to «data HTML" & pleasePost & "»\""
        -- printHeader does its own logging and returns no value, so it cannot be concatenated into a log line.
        if debug ≥ 5 then printHeader("toUnixSet", toUnixSet)
  1271.        
  1272.         set fromUnixSet to do shell script toUnixSet
  1273.         if debug ≥ 5 then log "fromUnixSet is " & fromUnixSet
  1274.        
  1275.     on error errMsg number n
  1276.         log "==> We tried to send back HTML data, but failed. " & errMsg & " with number " & n
  1277.     end try
  1278.     -- see what ended up on the clipboard
  1279.     set theList2 to clipboard info
  1280.     if debug ≥ 2 then printClipboardInfo(theList2)
  1281. end postToCLipboard
  1282.  
  1283. -- ------------------------------------------------------
  1284. on printClipboardInfo(theList)
  1285.     global debug
  1286.     if debug ≥ 5 then log "in ~~~ printClipboardInfo ~~~"
  1287.     log (clipboard info)
  1288.     log class of theList
  1289.     log "Data types on the clipboard ... "
  1290.     printList("", theList)
  1291.     log "... "
  1292. end printClipboardInfo
  1293.  
  1294. -- ------------------------------------------------------
  1295. (* Pump out the beginning of theString *)
  1296. on printHeader(theName, theString)
  1297.     global debug
  1298.     if debug ≥ 5 then
  1299.         log "in ~~~ printHeader ~~~"
  1300.         log "theName is " & theName
  1301.         log theString
  1302.         log "length of theString is " & length of theString
  1303.     end if
  1304.     if length of theString ≤ 0 then
  1305.         log "==> no string to print"
  1306.     else
  1307.         log theName & " is " & return & text 1 thru (minimumPositiveNumber from {400, length of theString}) of theString & "<+++++++++"
  1308.     end if
  1309. end printHeader
  1310.  
  1311. -- ------------------------------------------------------
  1312. (*
  1313. print out the items in a list
  1314.  
  1315. *)
  1316.  
  1317. on printList(theName, splits)
  1318.     global debug
  1319.     if debug ≥ 5 then log "in ~~~ printList ~~~"
  1320.     try
  1321.         set theCount to 1
  1322.         repeat with theEntry in splits
  1323.             --log "class of theEntry is " & class of theEntry
  1324.             set classDisplay to class of theEntry as text
  1325.             --log "classDisplay is " & classDisplay as text
  1326.             --log "class of classDisplay is " & class of classDisplay
  1327.             if classDisplay is "list" then
  1328.                 log "    " & theName & theCount & " is " & item 1 of theEntry & "; " & item 2 of theEntry
  1329.             else
  1330.                 log "    " & theName & theCount & " is " & theEntry
  1331.             end if
  1332.             set theCount to theCount + 1
  1333.         end repeat
  1334.     on error errMsg number n
  1335.         log "==> No go in printList. " & errMsg & " with number " & n
  1336.     end try
  1337. end printList
  1338.  
  1339. -- ------------------------------------------------------
(*
StefanK in https://macscripter.net/viewtopic.php?id=43852
Replaces one or more characters based on the length of theCharacter.

 Big Warning!!!
 ==============
    This on block is called by hexDumpFormatOne().
    Therefore, you may not call hexDumpFormatOne() from this on block.
    If you do so, you get yourself into an endless loop.
    Use hexDumpFormatZero() instead.

    script -k <output file name>
    osascript /Applications/applescriptFiles/workwithclipboardV13-HTML.app
    use Activity Monitor to stop osascript

*)

on replaceCharacter(theText, theOffset, theCharacter)
    global debug
    if debug ≥ 5 then
        log "in ~~~ replaceCharacter ~~~"
        log "  theOffset is " & getIntegerAndHex(theOffset) & " with theCharacter >" & theCharacter & "<  length of theText is " & getIntegerAndHex(length of theText)
        log "theText is " & theText
    end if

    set theOutput to theText -- ready to return if need be.
    -- "repeat 1 times" lets "exit repeat" act as an early exit that still reaches "return theOutput".
    repeat 1 times
        -- sanity checks
        if theOffset ≤ 0 then
            display dialog "No character to replace at " & theOffset & " with character " & theCharacter & " in " & theText giving up after 10
            log "==> Adjust theOffset to be within the string."
            exit repeat -------------- return ---------->
        end if
        if (theOffset - (length of theCharacter)) ≤ 0 then
            display dialog "Too near the front of the buffer.  " & theOffset & " with character " & theCharacter & " in " & theText giving up after 10
            log "==> Too near the front of the buffer. "
            exit repeat -------------- return ---------->
        end if
        if (theOffset + (length of theCharacter) - 1) > (length of theText) then
            display dialog "Too near the end of the buffer. " & theOffset & " with character " & theCharacter & " in " & theText giving up after 10
            log "==> Too near the end of the buffer. "
            log "  " & "theOffset is " & theOffset & " with theCharacter >" & theCharacter & "<  in " & theText
            log "length of buffer is " & getIntegerAndHex(length of theText)
            exit repeat -------------- return ---------->
        end if

        if debug ≥ 6 then
            log "theOffset is " & getIntegerAndHex(theOffset)
            log "theCharacter is " & theCharacter
        end if

        try
            -- If the replacement runs through the last character, there is no remainder text to append.
            -- The test includes the length of theCharacter so a multi-character replacement at the end also works.
            if (theOffset + (length of theCharacter) - 1) ≥ (length of theText) then
                set theOutput to (text 1 thru (theOffset - 1) of theText) & theCharacter
            else
                set theOutput to (text 1 thru (theOffset - 1) of theText) & theCharacter & (text (theOffset + (length of theCharacter)) thru -1 of theText)
            end if
        on error errMsg number n
            log "==> No go. " & errMsg & " with number " & n
            exit repeat -------------- return ---------->
        end try
    end repeat
    return theOutput
end replaceCharacter
  1405.  
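(*
 Worked example for replaceCharacter, added as a sketch and not taken from
 the original post. It assumes debug is below 5, so getIntegerAndHex is not
 called.

    replaceCharacter("0123456789", 5, "--")
    --> "0123--6789"
    -- characters 5 and 6 are overwritten; the length does not change

    replaceCharacter("abcdef", 6, "Z")
    --> "abcdeZ"
    -- replacing the last character

    replaceCharacter("abcdef", 1, "Z")
    --> "abcdef"
    -- offset 1 fails the "too near the front" check; once the dialog is
    -- dismissed with OK or times out, the text comes back unchanged
*)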
  1406. -- ------------------------------------------------------
(*
splitTextToList splits text on a delimiter and keeps the delimiter.
 thisText is the input string
 delim is what to split on

 results returned in a list

 Total hack. We know textToList strips off delim, so add it back
 to the front of every item after the first.
*)

on splitTextToList(thisText, delim)
    global debug
    if debug ≥ 5 then log "in ~~~ splitTextToList ~~~"

    set returnedList to textToList(thisText, delim)
    set resultArray to {}
    copy item 1 of returnedList to the end of the resultArray

    repeat with i from 2 to (count of returnedList)
        set newElement to delim & item i of returnedList
        copy newElement to the end of the resultArray
    end repeat

    return resultArray
end splitTextToList
  1432.  
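(*
 Usage sketch for splitTextToList, added for illustration (the sample
 strings are made up). The delimiter is put back on the front of every
 piece except the first. Assumes debug is already set.

    splitTextToList("one<p>two<p>three", "<p>")
    --> {"one", "<p>two", "<p>three"}

    splitTextToList("no delimiter here", "<p>")
    --> {"no delimiter here"}
*)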
  1433. -- ------------------------------------------------------
(*
 Retrieve the data between the "begin" and "end" tags. Whatever is between the strings.
*)
on tagContent(theString, startTag, endTag)
    global debug
    if debug ≥ 5 then log "in ~~~ tagContent ~~~"
    try
        if debug ≥ 5 then log "in tagContent. " & return & "    startTag is " & startTag & " endTag is " & endTag
        set beginningOfTag to chompLeftAndTag(theString, startTag)
        if length of beginningOfTag ≤ 0 then
            set middleText to ""
        else
            printHeader("beginningOfTag", beginningOfTag)
            set endingOffset to (offset of endTag in beginningOfTag)

            -- Assumes chompLeftAndTag returns the text that follows startTag.
            -- Offset 0 means endTag was not found; offset 1 means nothing sits between the tags.
            -- (Comparing against the length of endTag dropped content shorter than the end tag.)
            if endingOffset ≤ 1 then
                set middleText to ""
            else
                set middleText to text 1 thru (endingOffset - 1) of beginningOfTag
                printHeader("middleText", middleText)
            end if
        end if
    on error errMsg number n
        log "==> finding contained text failed. " & errMsg & " with number " & n
        set middleText to ""
    end try
    if debug ≥ 5 then log "returning with middleText is " & middleText
    return middleText
end tagContent
  1463.  
(*
textToList splits text on a delimiter.
 thisText is the input string
 delim is what to split on

 returns a list of strings. The delimiter itself is not kept.

- textToList was found here:
- http://macscripter.net/viewtopic.php?id=15423

*)
  1475.  
  1476. on textToList(thisText, delim)
  1477.     global debug
  1478.     if debug ≥ 5 then log "in ~~~ textToList ~~~"
  1479.     set resultList to {}
  1480.     set {tid, my text item delimiters} to {my text item delimiters, delim}
  1481.    
  1482.     try
  1483.         set resultList to every text item of thisText
  1484.         set my text item delimiters to tid
  1485.     on error
  1486.         set my text item delimiters to tid
  1487.     end try
  1488.     return resultList
  1489. end textToList
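(*
 Usage sketch for textToList, added for illustration. The handler restores
 text item delimiters before it returns, so callers do not need to reset
 them. Assumes debug is already set.

    textToList("a,b,,c", ",")
    --> {"a", "b", "", "c"}
    -- an empty item marks two adjacent delimiters

    textToList("abc", ",")
    --> {"abc"}
*)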
  1490.  
  1491. -- ------------------------------------------------------
  1492. on trimLf(inputLf, outputTrimmed, theLfOffset, substitueCharacter)
  1493.     global debug
  1494.    
  1495.     if debug ≥ 3 then
  1496.         log "in ~~~ trimLf ~~~"
  1497.         hexDumpFormatOne("inputLf", inputLf)
  1498.         hexDumpFormatOne("outputTrimmed", outputTrimmed)
        log "theLfOffset is " & getIntegerAndHex(theLfOffset)
  1500.         log "with substitueCharacter >" & substitueCharacter & "<  "
  1501.     end if
  1502.    
  1503.     -- check boundaries
  1504.     if theLfOffset ≤ 0 or (theLfOffset > (length of inputLf)) then
  1505.         -- We are almost done.
  1506.         log "no LF found."
        -- tack on any trailing stuff
  1508.         set outputTrimmed to outputTrimmed & inputLf
  1509.         set inputLf to ""
  1510.         if debug ≥ 3 then
  1511.             hexDumpFormatOne("inputLf", inputLf)
  1512.             hexDumpFormatOne("outputTrimmed", outputTrimmed)
  1513.         end if
  1514.         return {inputLf, outputTrimmed} ------------ return ------------>
  1515.     end if
  1516.    
  1517.     -- We need to deal with output first, so we haven't trimmed the input we need.
  1518.     if theLfOffset ≥ 2 then
  1519.         set outputTrimmed to outputTrimmed & (text 1 thru (theLfOffset - 1) of inputLf) & substitueCharacter
  1520.     else if theLfOffset = 1 then
  1521.         -- no stuff before the lf
  1522.         set outputTrimmed to outputTrimmed & substitueCharacter
  1523.     end if
  1524.    
  1525.     -- deal with inputLf.
  1526.     if theLfOffset < (length of inputLf) then
  1527.         -- trailing stuff
  1528.         set inputLf to text (theLfOffset + 1) thru -1 of inputLf
  1529.     else if theLfOffset is (length of inputLf) then
  1530.         set inputLf to ""
  1531.     end if
  1532.    
  1533.     if debug ≥ 3 then
  1534.         hexDumpFormatOne("inputLf", inputLf)
  1535.         hexDumpFormatOne("outputTrimmed", outputTrimmed)
  1536.     end if
  1537.     if debug ≥ 4 then log "bye from  -.- trimLf -.-"
  1538.     return {inputLf, outputTrimmed}
  1539.    
  1540. end trimLf
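(*
 A sketch of how a caller might drive trimLf. The loop and the variable
 names remaining and collected are assumptions for illustration, not taken
 from the original script. Keep debug below 3 to skip the hex dumps.

    set remaining to "ab" & linefeed & "cd"
    set collected to ""
    repeat while remaining is not ""
        set {remaining, collected} to trimLf(remaining, collected, (offset of linefeed in remaining), " ")
    end repeat
    collected --> "ab cd"

 The first pass returns {"cd", "ab "}. On the second pass the offset is 0,
 so trimLf appends the rest of the input and hands back an empty inputLf,
 which ends the loop.
*)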
  1541.  
  1542. -- ------------------------------------------------------
(*
    Line endings by platform
        Unix-like systems (Linux, macOS)    LF      0A      \n
        Microsoft Windows                   CRLF    0D 0A   \r\n
        classic Mac OS                      CR      0D      \r      AppleScript return
*)
on typeText(theData)

    global debug
    if debug ≥ 5 then log "in ~~~ typeText ~~~"
    set lf to character id 10 -- LF is 0A hex, which is 10 decimal
  1554.    
  1555.     if debug ≥ 2 then printHeader("the input  ( theData )", theData)
  1556.     -- Example: -- https://discussions.apple.com/docs/DOC-8841
  1557.     -- locate links
  1558.    
  1559.     set theOutputBuffer to adjustURLs(theData)
  1560.    
  1561.     -- add paragraphs
  1562.     set theOutputBuffer to addParagraphs(theOutputBuffer)
  1563.    
  1564.     if debug ≥ 2 then log "theOutputBuffer is " & return & theOutputBuffer
  1565.     if debug ≥ 4 then log "bye from  -.- typeText -.-"
  1566.     return theOutputBuffer
  1567. end typeText
  1568.  
  1569.  
  1570.  
  1571. (*
  1572. https://www.oreilly.com/library/view/applescript-the-definitive/0596102119/re89.html
  1573.  
  1574. https://stackoverflow.com/questions/11085654/apple-script-how-can-i-copy-html-content-to-the-clipboard
  1575.  
-- user has copied a file's icon in the Finder
clipboard info
-- {{string, 20}, {«class ut16», 44}, {«class hfs », 80}, {«class utf8», 20}, {Unicode text, 42}, {picture, 2616}, {«class icns», 43336}, {«class furl», 62}}
  1581.  
  1582. textutil -convert html foo.rtf
  1583.  
  1584. if ((clipboard info) as string) contains "«class furl»" then
  1585.         log "the clipboard contains a file named " & (the clipboard as string)
  1586.     else
  1587.         log "the clipboard does not contain a file"
  1588.     end if
  1589.    
  1590. the clipboard       required
  1591. as  class   optional
  1592.  
  1593. tell application "Script Editor"
  1594.         activate
  1595.     end tell
  1596.    
  1597. textutil has a simplistic text to html conversion
  1598.    set clipboardDataQuoted to quoted form of theData
  1599.     log "quoted form is " & clipboardDataQuoted
  1600.    
  1601.     set toUnix to "/bin/echo -n " & clipboardDataQuoted
  1602.     set toUnix to toUnix & " | textutil -convert html -noload -nostore -stdin -stdout "
  1603.     log "toUnix is " & toUnix
  1604.     set fromUnix to do shell script toUnix
  1605.     log "fromUnix  is " & fromUnix
  1606.    
  1607.    
  1608. set s to "Today is my birthday"
  1609. log text 1 thru ((offset of "my" in s) - 1) of s
  1610. --> "Today is "
  1611.             -- text 1 thru ((offset of "my" in s) - 1) of s
  1612.             -- -1 since offset return the first character "m" position count
  1613.            
  1614. log "beginningOfTag is " & text 1 thru (minimumPositiveNumber from {200, length of beginningOfTag}) of beginningOfTag & "<+++++++++++++++++++++++"
  1615.  
  1616. https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/reference/ASLR_cmds.html
  1617.  
  1618. *)
  1619.  
--mac $ hex=`echo -n "<p>your html code here</p>" | hexdump -ve '1/1 "%.2x"'`
--mac $ echo $hex
--3c703e796f75722068746d6c20636f646520686572653c2f703e
--mac $ osascript -e "set the clipboard to «data HTML${hex}»"
--mac $
  1625. (*  
  1626. A sub-routine for encoding ASCII characters.  
  1627.  
  1628. encode_char("$")  
  1629. --> returns: "%24"  
  1630.  
  1631. based on:  
  1632. https://www.macosxautomation.com/applescript/sbrt/sbrt-08.html  
  1633.  
  1634. *)
  1635. (*
  1636. Lowest Numeric Value in a List
  1637.  
  1638. This sub-routine will return the lowest numeric value in a list of items. The passed list can contain non-numeric data as well as lists within lists. For example:
  1639.  
  1640. lowest_number({-3.25, 23, 2345, "sid", 3, 67})
  1641. --> returns: -3.25
  1642. lowest_number({-3.25, 23, {-22, 78695, "bob"}, 2345, true, "sid", 3, 67})
  1643. --> returns: -22
  1644.  
  1645. If there is no numeric data in the passed list, the sub-routine will return a null string ("")
  1646.  
  1647. lowest_number({"this", "list", "contains", "only", "text"})
  1648. --> returns: ""
  1649.  
  1650. https://macosxautomation.com/applescript/sbrt/sbrt-03.html
  1651.  
  1652. Here's the sub-routine:
  1653.  
  1654. *)
  1655. (*
  1656. on lowestNumber(values_list)
  1657.     set the low_amount to ""
  1658.     repeat with i from 1 to the count of the values_list
  1659.         set this_item to item i of the values_list
  1660.         set the item_class to the class of this_item
  1661.         if the item_class is in {integer, real} then
  1662.             if the low_amount is "" then
  1663.                 set the low_amount to this_item
  1664.             else if this_item is less than the low_amount then
  1665.                 set the low_amount to item i of the values_list
  1666.             end if
  1667.         else if the item_class is list then
            set the low_value to lowestNumber(this_item)
            if the low_value is not "" and (the low_amount is "" or the low_value is less than the low_amount) then ¬
                set the low_amount to the low_value
  1671.         end if
  1672.     end repeat
  1673.     return the low_amount
  1674. end lowestNumber
  1675.  
  1676. https://lists.apple.com/archives/applescript-users/2010/Sep/msg00139.html
  1677. set list_of_values to {10, 20, 30, 40, 50, 60, 2000, 9, 3000, 4}
  1678.  
  1679. set minimum to 9.9999999999E+12
  1680. set maximum to 0
  1681. repeat with ref_to_value in list_of_values
  1682.     set the_value to contents of ref_to_value
  1683.     if the_value > maximum then set maximum to the_value
  1684.     if the_value < minimum then set minimum to the_value
  1685. end repeat
  1686.  
  1687. {minimum, maximum}
  1688.  
  1689. may do the trick.
  1690.  
  1691. Yvan KOENIG (VALLAURIS, France) lundi 13 septembre 2010 22:32:41
  1692. *)
  1693. (* https://lists.apple.com/archives/applescript-users/2010/Sep/msg00139.html
  1694. set list_of_values to {10, 20, 30, 40, 50, 60, 2000, 9, 3000, 4}
  1695.  
  1696. set minimum to 9.9999999999E+12
  1697.  
  1698. assume it's limited to positive values
  1699.  
  1700.  
  1701. on maxValue(list_of_values)
  1702.     global debug
  1703.     if debug ≥ 5 then log "in maxValue " & return & list_of_values
  1704.     set maximum to 0
  1705.     repeat with ref_to_value in list_of_values
  1706.         set the_value to contents of ref_to_value
  1707.         if the_value > maximum then set maximum to the_value
  1708.     end repeat
  1709.     if debug ≥ 5 then log maximum
  1710.     return maximum
  1711. end maxValue
  1712. *)
  1713. -- ------------------------------------------------------
  1714. (*
  1715. http://harvey.nu/applescript_url_encode_routine.html
  1716.  
  1717. on urlencode(theText)
  1718.     set theTextEnc to ""
  1719.     repeat with eachChar in characters of theText
  1720.         set useChar to eachChar
  1721.         set eachCharNum to ASCII number of eachChar
  1722.         if eachCharNum = 32 then
  1723.             set useChar to "+"
  1724.         else if (eachCharNum ≠ 42) and (eachCharNum ≠ 95) and (eachCharNum < 45 or eachCharNum > 46) and (eachCharNum < 48 or eachCharNum > 57) and (eachCharNum < 65 or eachCharNum > 90) and (eachCharNum < 97 or eachCharNum > 122) then
  1725.             set firstDig to round (eachCharNum / 16) rounding down
  1726.             set secondDig to eachCharNum mod 16
  1727.             if firstDig > 9 then
  1728.                 set aNum to firstDig + 55
  1729.                 set firstDig to ASCII character aNum
  1730.             end if
  1731.             if secondDig > 9 then
  1732.                 set aNum to secondDig + 55
  1733.                 set secondDig to ASCII character aNum
  1734.             end if
  1735.             set numHex to ("%" & (firstDig as string) & (secondDig as string)) as string
  1736.             set useChar to numHex
  1737.         end if
  1738.         set theTextEnc to theTextEnc & useChar as string
  1739.     end repeat
  1740.     return theTextEnc
  1741. end urlencode
  1742.  
  1743. Clipboard classes after a copy from the application.
  1744. from waterfox
  1745. (*«class HTML», 13876, «class utf8», 505, «class ut16», 1012, string, 505, Unicode text, 1010*)
  1746.  
  1747. from chrome
  1748. (*«class HTML», 748, «class utf8», 204, «class ut16», 410, string, 204, Unicode text, 408*)
  1749.  
  1750. from safari
  1751. (*«class weba», 120785, «class RTF », 70255, «class HTML», 122811, «class utf8», 3370, «class ut16», 6772, uniform styles, 47132, string, 3385, scrap styles, 8122, Unicode text, 6732, uniform styles, 47132, scrap styles, 8122*)
  1752.  
  1753. iCab
  1754. (*«class weba», 1665, «class RTF », 763, «class utf8», 121, «class ut16», 244, uniform styles, 376, string, 121, scrap styles, 62, Unicode text, 242, uniform styles, 376, scrap styles, 62*)
  1755.  
  1756. Opera
  1757. (*«class HTML», 5767, «class utf8», 150, «class ut16», 302, string, 150, Unicode text, 300*)
  1758.  
  1759. Textedit
  1760. (*«class RTF », 1136, «class utf8», 138, «class ut16», 278, uniform styles, 148, string, 138, scrap styles, 22, Unicode text, 276, uniform styles, 148, scrap styles, 22*)
  1761.  
  1762. Word
  1763. (*«class DSIG», 4, «class DOBJ», 56, «class OBJD», 244, «class RTF », 30573, «class HTML», 21160, scrap styles, 22, uniform styles, 136, string, 210, Unicode text, 420, «class PDF », 13197, picture, 154058, «class EMBS», 33280, «class LNKS», 909, «class LKSD», 244, «class OJLK», 93, «class HLNK», 1387, «class OFSC», 232, «class ut16», 422, «class DSIG», 4, «class DOBJ», 56, «class OBJD», 244, scrap styles, 22, uniform styles, 136, «class EMBS», 33280, «class LNKS», 909, «class LKSD», 244, «class OJLK», 93, «class HLNK», 1387, «class OFSC», 232*)
  1764.  
  1765. TextWrangler
  1766. (*«class utf8», 185, «class BBLM», 4, «class ut16», 372, string, 185, Unicode text, 370, «class BBLM», 4*)
  1767.  
  1768. *)