rccharles

ASC adjust html

May 7th, 2019
  1. (*
  2.   This AppleScript converts clipboard input into a format suited for pasting into an ASC
  3.   (Apple Support Communities) reply.  Text I copied into an ASC reply was poorly formatted,
  4.   while text copied from a web browser was formatted much better.  So this script
  5.   rewrites the clipboard into the HTML format a web browser produces.
  6.  
  7.  This AppleScript accepts the clipboard in either of two forms:
  8.  -- plain text, which is converted to HTML.  Conversion is limited to inserting paragraph tags for blank lines and inserting links where http or https text appears. The page title is substituted for the link text.  
  9.  -- HTML source code, identified by text containing HTML markup.  
  10.          Caveat emptor.  
  11.  
  12.  To use:
  13.  1) Copy (command + C) the data you want to convert.
  14.  2) Run this AppleScript by double-clicking the app.
  15.  3) Paste (command + V) into an ASC reply.
  16.  
  17.  I have tested in Waterfox 56.2.9 in Yosemite.  I assume the process will work with other web browsers and other versions of macOS.
  18.  
  19.  Save as an Application Bundle.  Don't check any of the boxes.
  20.  
  21. Should you experience a problem, run the script in the Script Editor.
  22.    The script has three entry points: on run, on open (dropped files) and a folder action stub. The first two write log statements.
  23.    It is easier to diagnose problems with debug information. I suggest adding log statements to your script to see what is going on.  
  24.    
  25.   For testing, run in the Script Editor.
  26.       1) Click on the Event Log tab to see the output from the log statements.
  27.       2) Click on Run.
  28.    
  29. change log
  30. May 1, 2019 -- skip a "403 Forbidden" title
  31. May 2, 2019 -- convert \" to ".  The \" mysteriously appears in HTML source code input.  Probably a TextEdit artifact
  32.               from copying into TextEdit and back out of TextEdit.          
  33.  
  34. enhancements:
  35.   -- get pdf title
  36.  
  37.  
  38. Author: rccharles
  39.  
  40.  Copyright 2019 rccharles  
  41.      
  42.        Permission is hereby granted, free of charge, to any person obtaining a copy  
  43.        of this software and associated documentation files (the "Software"), to deal  
  44.        in the Software without restriction, including without limitation the rights  
  45.        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell  
  46.        copies of the Software, and to permit persons to whom the Software is  
  47.        furnished to do so, subject to the following conditions:  
  48.        
  49.        The above copyright notice and this permission notice shall be included in all  
  50.        copies or substantial portions of the Software.  
  51.        
  52.        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR  
  53.        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,  
  54.        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE  
  55.        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER  
  56.        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,  
  57.        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE  
  58.        SOFTWARE.  
  59.  
  60.  
  61.     example text document:
  62.     set the clipboard to «data»
  63. set the clipboard to "\"Effective defenses 111 threats\" by John Galt
  64. https://discussions.apple.com/docs/DOC-8841
  65. \"Avoid phishing emails 222 and other scams\"
  66.  
  67. https://support.apple.com/en-ca/HT204759
  68.  
  69.  
  70.  
  71. blank lines
  72. also,see:http://www.google.com/ seeing again:http://www.google.com"
  73.  
  74.  *)
  75.  
  76.  
  77. -- Gets invoked here when you run in AppleScript editor or double click on the app icon.
  78. on run
  79.     global debug
  80.     set debug to 2
  81.     (* *)
  82.     set the clipboard to «data»
  83.    
  84.     (*
  85.       Full version<br>
  86.     <a href="http://www.parallels.com/en/products/desktop/">http://www.parallels.com/en/products/desktop/</a><br>
  87.     <br>
  88.     <b>VMware Fusion</b><br>
  89.     *)
  90.     set the clipboard to «data HTML2020202046756C6C2076657273696F6E3C62723E0A202020203C6120687265663D22687474703A2F2F7777772E706172616C6C656C732E636F6D2F656E2F70726F64756374732F6465736B746F702F223E687474703A2F2F7777772E706172616C6C656C732E636F6D2F656E2F70726F64756374732F6465736B746F702F3C2F613E3C62723E0A202020203C62723E0A202020203C623E564D7761726520467573696F6E3C2F623E3C62723E0A»
  91.    
  92.     (* *)
  93.    
  94.     set theList to clipboard info
  95.     printClipboardInfo(theList)
  96.    
  97.     set cbInfo to get (clipboard info) as string
  98.    
  99.     -- Most likely, if we have HTML data in the clipboard it will be from a web browser or Word.
  100.     if cbInfo contains "HTML" then
  101.        
  102.         log "Working with HTML Class data from clipboard."
  103.         set theBoard to the clipboard as «class HTML»
  104.         --log "Print out inputted HTML data on the clipboard..." -- it's just going to be a hex string. waste.
  105.         --log theBoard
  106.        
  107.         set normalHtml to do shell script "osascript -e 'try' -e 'get the clipboard as «class HTML»' -e 'end try' | awk '{sub(/«data HTML/, \"\") sub(/»/, \"\")} {print}' | xxd -r -p "
  108.         log "...Print out plain text version of inputted HTML data from the clipboard..." & return & normalHtml
  109.         log "printed in hex"
  110.         hexDumpFormat("normalHtml", normalHtml)
  111.        
  112.         set returnedData to adjustBrowserHTML(normalHtml)
  113.         log "...Print out plain text version of adjusted HTML data ..." & return & returnedData
  114.         log "...just printed plain text version"
  115.         log "printed in hex"
  116.         hexDumpFormat("returnedData", returnedData)
  117.        
  118.        
  119.         try
  120.             log "returnedData is " & returnedData
  121.         on error errStr number errorNumber
  122.             log "===> We didn't find HTML data.   errStr is " & errStr & " errorNumber is " & errorNumber
  123.             return
  124.         end try
  125.     else
  126.         -- will work with plain text.
  127.         try
  128.             log "Working with plain text"
  129.             set clipboardData to (the clipboard as text)
  130.             if debug ≥ 2 then
  131.                 log "class clipboardData is " & class of clipboardData
  132.                 log "calling printHeader."
  133.             end if
  134.             printHeader("clipboardData", clipboardData)
  135.         on error errStr number errorNumber
  136.             log "===> We didn't find data on the clipboard.   errStr is " & errStr & " errorNumber is " & errorNumber
  137.             display dialog "We found neither HTML source code nor plain text on the clipboard." & return & "Please copy from a different source." giving up after 15
  138.             return 1
  139.         end try
  140.        
  141.         set returnedData to common(clipboardData)
  142.     end if
  143.     log "place on the clipboard returnedData is " & returnedData
  144.     set classHTML to typeHTML(returnedData)
  145.     postToCLipboard(classHTML)
  146.     -- return code
  147.     return 0
  148.    
  149.    
  150. end run
  151.  
  152. -- Folder actions.
  153. -- Gets invoked here when something is dropped on the folder that this script is monitoring.
  154. -- Right click on the folder to be monitored. Services > Folder Actions Setup...
  155. on adding folder items to this_folder after receiving added_items
  156.     -- TBD
  157.    
  158. end adding folder items to
  159.  
  160.  
  161.  
  162. -- Gets invoked here when something is dropped on this AppleScript icon
  163. on open dropped_items
  164.     global debug
  165.     set debug to 2
  166.    
  167.     (*
  168.     -- Debug code.
  169.       set fileName to choose file with prompt "get file"
  170.       set dropped_items to {fileName}
  171.     *)
  172.     log "class of dropped_items is " & class of dropped_items
  173.     display dialog "You dropped " & (count of dropped_items) & " item or items." & return & "  Caveat emptor. You have been warned." giving up after 6
  174.    
  175.     set totalFileData to ""
  176.     repeat with droppedItem in dropped_items
  177.         log "The droppedItem is "
  178.         -- display dialog "processing file " & (droppedItem as string) giving up after 3
  179.         log droppedItem
  180.         log "class = " & class of droppedItem
  181.         set extIs to findExtension(droppedItem)
  182.         set extIsU to makeCaseUpper(extIs)
  183.         if extIsU is "HTML" or extIsU is "HTM" or extIsU is "TEXT" or extIsU is "TXT" then
  184.             try
  185.                 set theFile to droppedItem as string
  186.                 set theFile to open for access file theFile
  187.                 set allOfFile to read theFile
  188.                 close access theFile
  189.                 printHeader("read from file ( allOfFile )", allOfFile)
  190.                 set totalFileData to totalFileData & common(allOfFile)
  191.             on error theErrorMessage number theErrorNumber
  192.                 log theErrorMessage & " error number " & theErrorNumber
  193.                 close access theFile
  194.             end try
  195.            
  196.         else
  197.             -- we do not support this extension
  198.             display dialog "We only support files with an extension of html, htm, text or txt, in upper or lower case. Your file had a " & extIs & " extension. Skipping." giving up after 10
  199.            
  200.         end if
  201.     end repeat
  202.    
  203.     postToCLipboard(totalFileData)
  204.     -- return code
  205.     return 0
  206.    
  207. end open
  208.  
  209.  
  210. -- ------------------------------------------------------
  211. on common(clipboardData)
  212.     global debug
  213.     set lf to character id 10
  214.     -- Write a message into the event log.
  215.     log "  --- Starting on " & ((current date) as string) & " --- "
  216.     set cbInfo to get (clipboard info) as string
  217.    
  218.    
  219.     -- don't let Windows line endings confuse us. Convert Return LineFeed (CRLF) to lf
  220.     set clipboardData to alterString(clipboardData, return & lf, lf)
  221.     -- might as well convert classic Mac OS return to lf, so we have fewer cases to look for.
  222.     set clipboardData to alterString(clipboardData, return, lf)
  223.    
  224.     -- figure out what type of data we have: plain text or HTML source code text.
  225.     set paraCount to count of textToList(clipboardData, "<p")
  226.     set endparaCount to count of textToList(clipboardData, "</p>")
  227.     set titleCount to count of textToList(clipboardData, "<title")
  228.     set endTitleCount to count of textToList(clipboardData, "</title>")
  229.     set aLinkCount to count of textToList(clipboardData, "href=\"http")
  230.     -- mangled href="http
  231.     set mangledLinkCount to count of textToList(clipboardData, "href=\\\"http")
  232.     set brCount to count of textToList(clipboardData, "<br>")
  233.     if debug ≥ 1 then
  234.         log "Values used to distinguish HTML source code from plain text."
  235.         log "paraCount  is " & paraCount
  236.         log "endparaCount is " & endparaCount
  237.         log "titleCount is " & titleCount
  238.         log "endTitleCount is " & endTitleCount
  239.         log "aLinkCount is " & aLinkCount
  240.         log "brCount is " & brCount
  241.         log "mangledLinkCount is " & mangledLinkCount
  242.     end if
  243.     --set endHttpCount to count of textToList(clipboardData, "http://")
  244.     --set endHttpsCount to count of textToList(clipboardData, "https://")
  245.     -- note, textToList returns a count one greater than the actual because item one is the data before the first found entry.
  246.     if (paraCount ≥ 4 and endparaCount ≥ 3) or brCount ≥ 4 or ((titleCount is endTitleCount) and titleCount ≥ 2) or aLinkCount ≥ 3 or mangledLinkCount ≥ 3 then
  247.         log "... found HTML input ... in plain text format."
  248.         -- strange \" are appearing in input text.  Probably the result of using TextEdit along the way.
  249.         -- quick hack.
  250.         set alteredClipboardData to alterString(clipboardData, "\\\"", "\"")
  251.         set readyData to typeHTML(alteredClipboardData)
  252.     else
  253.         log "... found plain text input ..."
  254.         set readyData to typeText(clipboardData)
  255.        
  256.     end if
  257.     return readyData
  258. end common
  259. -- ------------------------------------------------------
  260. (*
  261.   We received HTML class data on the clipboard.  This is the manager.
  262.  *)
  263. on adjustBrowserHTML(normalHtml)
  264.     set lf to character id 10
  265.     -- don't let Windows line endings confuse us. Convert Return LineFeed (CRLF) to lf
  266.     set normalHtml to alterString(normalHtml, return & lf, lf)
  267.     -- might as well convert classic Mac OS return to lf, so we have fewer cases to look for.
  268.     set normalHtml to alterString(normalHtml, return, lf)
  269.     hexDumpFormat("normalHtml", normalHtml)
  270.    
  271.     set alteredHTML to adjustURLs(normalHtml)
  272.     return alteredHTML
  273. end adjustBrowserHTML
  274.  
  275. -- ------------------------------------------------------
  276. (*
  277. example:
  278.   Free version of Parallels for individual use:</p><p><br></p>
  279.   <p>https://itunes.apple.com/us/app/parallels-desktop-lite/id1085114709?mt=12</p>
  280.   <p><br></p>
  281.   <p>Full version</p><p><a href="http://www.parallels.com/en/products/desktop/" target="_blank">
  282.      http://www.parallels.com/en/products/desktop/</a>
  283.      
  284. If ASC finds a URL outside of an <a> tag, it places blank lines around the URL. It does not go the
  285. full nine yards and wrap the URL in an <a> tag.
  286.  
  287. *)
  288. on adjustURLs(theOriginalInputBuffer)
  289.     global debug
  290.     set theInputBuffer to theOriginalInputBuffer
  291.     set lf to character id 10
  292.    
  293.     set buildHTML to ""
  294.     if debug ≥ 3 then log "buildHTML [ should be empty string ] is " & buildHTML
  295.     set countI to 1 -- variable is used for debugging.
  296.     -- do until we have processed theInputBuffer
  297.     repeat until theInputBuffer is ""
  298.         log "at the top of theInputBuffer ........."
  299.        
  300.         set foundWhere to {}
  301.         repeat with lookCharacters in {"https://", "http://", "<a "}
  302.             copy (offset of lookCharacters in theInputBuffer) to the end of the foundWhere
  303.             try
  304.                 set tempLoc to (offset of lookCharacters in theInputBuffer)
  305.                 log "searching for " & lookCharacters & " found at offset  " & tempLoc & " contains " & text tempLoc thru (tempLoc + ((length of lookCharacters) - 1)) of theInputBuffer
  306.             end try
  307.         end repeat
  308.         log foundWhere
  309.         set foundMarker to (minimumPositiveNumber from foundWhere)
  310.         -- figure out what type of marker we got?
  311.        
  312.         -- None.  Reached the end of the data without finding one.
  313.         if foundMarker ≤ 0 then
  314.             -- we are done
  315.             log "Found all links."
  316.             set buildHTML to buildHTML & theInputBuffer
  317.             printHeader("buildHTML", buildHTML)
  318.             set theInputBuffer to ""
  319.             exit repeat -- ------ done processing theInputBuffer ------>
  320.         end if
  321.        
  322.         -- find which of three markers we found.
  323.         if (text foundMarker thru (foundMarker + 2) of theInputBuffer) is "<a " then
  324.             set actualMarker to "<a "
  325.         else if text foundMarker thru (foundMarker + 6) of theInputBuffer is "http://" then
  326.             set actualMarker to "http://"
  327.         else
  328.             -- just assume it's the remaining "https://" since we looked for just three.
  329.             set actualMarker to "https://"
  330.         end if
  331.         set actualMarkerOffset to ((length of actualMarker) - 1)
  332.         log "actualMarker is " & actualMarker & " actualMarkerOffset is " & actualMarkerOffset
  333.        
  334.         log "foundMarker is " & foundMarker & "  verify marker text is " & text foundMarker thru (foundMarker + actualMarkerOffset) of theInputBuffer
  335.        
  336.        
  337.        
  338.         if foundMarker ≥ 2 then
  339.             log "buildHTML is " & buildHTML & " length is " & length of buildHTML
  340.             hexDumpFormat("theInputBuffer", theInputBuffer)
  341.             log " (foundMarker - 1) is " & (foundMarker - 1)
  342.             -- get the preceding text
  343.             set buildHTML to buildHTML & text 1 thru (foundMarker - 1) of theInputBuffer
  344.             log "buildHTML is " & buildHTML
  345.             --printHeader("buildHTML", buildHTML)
  346.             hexDumpFormat("buildHTML", buildHTML)
  347.            
  348.             -- https://apple.stackexchange.com/a/20135/44531
  349.            
  350.             set theInputBuffer to text foundMarker thru -1 of theInputBuffer -- trim off the characters before what we found
  351.             printHeader("theInputBuffer", theInputBuffer)
  352.             hexDumpFormat("theInputBuffer", theInputBuffer)
  353.         else
  354.             log "no preceding data."
  355.         end if
  356.        
  357.         repeat 1 times -- single-pass loop, so that "exit repeat" can skip ahead to the next link
  358.            
  359.             -- example: the URL is also the display text
  360.             -- <a href="https://discussions.apple.com/docs/DOC-8841" target="_blank">https://discussions.apple.com/docs/DOC-8841</a>
  361.             hexDumpFormat("theInputBuffer", theInputBuffer)
  362.             if text 1 thru (length of "<a ") of theInputBuffer is "<a " then
  363.                 -- found <a> tag
  364.                 -- find ending </a> tag
  365.                 set whereEnds to offset of "</a>" in theInputBuffer
  366.                 if whereEnds ≤ 0 then
  367.                     log "==> found an error in the HTML.  no ending </a>"
  368.                     set buildHTML to buildHTML & theInputBuffer
  369.                     printHeader("buildHTML", buildHTML)
  370.                     set theInputBuffer to ""
  371.                     exit repeat -- ------ next ------>
  372.                 end if
  373.                 set lastOffset to ((length of "</a>") - 1)
  374.                 log "lastOffset is " & lastOffset
  375.                 set lastCharacter to whereEnds + lastOffset
  376.                 log "lastCharacter is " & lastCharacter
  377.                 -- copy up to, but not including, the ending ">"; the trim below keeps that ">" at the front of theInputBuffer, so the next pass copies it exactly once
  378.                 set buildHTML to buildHTML & text 1 thru (lastCharacter - 1) of theInputBuffer
  379.                 printHeader("buildHTML", buildHTML)
  380.                 -- https://apple.stackexchange.com/a/20135/44531
  381.                 set theInputBuffer to text lastCharacter thru -1 of theInputBuffer --trim out <a>
  382.                 printHeader("theInputBuffer", theInputBuffer)
  383.                 exit repeat -- ------ next ------>
  384.             end if
  385.            
  386.             -- find the end of the HTML URL by splitting on blank or return
  387.             -- unsafe characters  <blank> " < > # % { } | \ ^ ~ [ ] `
  388.             -- and line-end
  389.             -- https://perishablepress.com/stop-using-unsafe-characters-in-urls/
  390.             -- the clipboard string may end right after the url, hence no " ", LF or CR
  391.             -- Remember, CRLF was converted to LF above
  392.             set endsWhere to {}
  393.             -- the end of the url ends with one of the not allowed characters + line-end
  394.             repeat with unsafeCharacter in {" ", "\"", lf, "<", ">", "#", "%", "{", "}", "|", "\\", "^", "~", "[", "]"}
  395.                 copy (offset of unsafeCharacter in theInputBuffer) to the end of the endsWhere
  396.             end repeat
  397.             log endsWhere
  398.             set endOfURL to (minimumPositiveNumber from endsWhere) - 1
  399.            
  400.             log "endOfURL is " & endOfURL
  401.            
  402.             if endOfURL ≤ 0 then
  403.                 -- We have reached the end of the input
  404.                 set theURL to theInputBuffer
  405.                 set theInputBuffer to ""
  406.             else
  407.                 set theURL to text 1 thru endOfURL of theInputBuffer
  408.                 log "from middle theURL is " & theURL
  409.                
  410.                 set theInputBuffer to text (endOfURL + 1) thru -1 of theInputBuffer -- trim off url in front.
  411.             end if
  412.             printHeader("printHeader", theInputBuffer)
  413.             log "----------------------- " & theURL & " -----------------------"
  414.             (*
  415.             retrieve the file pointed to by the URL so we can
  416.             get the title. Note: <title> can have attributes.  Example:
  417.                
  418.             <title data-test-page-title="Parallels Desktop Lite on the Mac App Store"
  419.             >‎Parallels Desktop Lite on the Mac App Store</title>
  420.            
  421.             *)
  422.            
  423.             -- Example:
  424.             -- curl --silent --location --max-time 10 <URL>
  425.             set toUnix to "curl --silent --location --max-time 10 " & quoted form of theURL
  426.             log "what we will use to retrieve the Url. toUnix  is " & return & "  " & toUnix
  427.             try
  428.                 log "reading link file to get title"
  429.                 set fromUnix to do shell script toUnix
  430.                 --log "fromUnix:"
  431.                 printHeader("fromUnix", fromUnix)
  432.                 -- may not be working with an HTML document, so the found title may be too long or confused.
  433.                 log "how far?..."
  434.                 -- there could be some baggage (attributes) with the <title
  435.                 set actualTagData to tagContent(fromUnix, "<title", "</title>")
  436.                 -- Find what we will actually display in the title.
  437.                 -- Fix up gotchas.             
  438.                 log "actualTagData  is " & printHeader("actualTagData", actualTagData)
  439.                 if actualTagData is "" then
  440.                     set actualTagData to theURL
  441.                 else if length of actualTagData > 140 then
  442.                     log "length of actualTagData is " & length of actualTagData & " which is too long.  Using the URL instead."
  443.                     set actualTagData to theURL
  444.                     -- curl https://appleid.apple.com returns <title>403 Forbidden</title>
  445.                     -- which is misleading.
  446.                 else if actualTagData contains "403" and actualTagData contains "Forbidden" then
  447.                     set actualTagData to theURL
  448.                 else
  449.                     -- there could be some attributes within the <title> tag.
  450.                     -- or there could not be
  451.                     -- an attribute could have a > in it. ignoring that for now.
  452.                     try
  453.                         -- find where <title ends
  454.                         set whereToEnd to (offset of ">" in actualTagData)
  455.                         log "whereToEnd is " & whereToEnd
  456.                         set whereToBegin to whereToEnd + (length of ">")
  457.                         log "whereToBegin is " & whereToBegin
  458.                         hexDumpFormat("actualTagData", actualTagData)
  459.                         set actualTagData to text whereToBegin thru (length of actualTagData) of actualTagData
  460.                         log "actualTagData is " & actualTagData
  461.                     on error theErrorMessage number theErrorNumber
  462.                         log "==>No ending greater than (>) for title. Badly constructed html." & return & "message is " & theErrorMessage & " error number " & theErrorNumber
  463.                         set actualTagData to actualTagData
  464.                         -- no need to repair.  It's not our page.
  465.                     end try
  466.                    
  467.                     -- found line-end in title.  caused confusion.
  468.                     -- note: this is new data and the multiple line-ends have not been
  469.                     -- filtered out.
  470.                     -- some joker had a line-end in the title!
  471.                     set actualTagData to alterString(actualTagData, return & lf, "  ")
  472.                     set actualTagData to alterString(actualTagData, return, " ")
  473.                     set actualTagData to alterString(actualTagData, lf, "  ")
  474.                     log "actualTagData has been changed to  " & actualTagData
  475.                     hexDumpFormat("actualTagData", actualTagData)
  476.                 end if
  477.             on error errMsg number n
  478.                 log "==> Error occurred when looking for title. " & errMsg & " with number " & n
  479.                 set actualTagData to theURL
  480.             end try
  481.             -- why the _blank in the <a>?
  482.             set assembled to "<a href=\"" & theURL & "\" target=\"_blank\">" & actualTagData & "</a>"
  483.             log "assembled  is " & assembled
  484.            
  485.             if (length of theInputBuffer) ≤ 0 then
  486.                 -- We have reached the end of the input
  487.                 log "we have reached the end of the input."
  488.                 set buildHTML to buildHTML & assembled
  489.             else
  490.                 log "more input to process"
  491.                 set buildHTML to buildHTML & assembled
  492.             end if
  493.            
  494.             -- wrap up
  495.             --log "transformed text from buildHTML is  " & return & buildHTML
  496.             log "#" & countI & " transformed text from buildHTML is  " & return & buildHTML
  497.             -- number of links found
  498.             set countI to countI + 1
  499.            
  500.         end repeat -- single-pass loop used for "exit repeat"
  501.     end repeat -- processing links in the input text
  502.    
  503.     return the buildHTML
  504.    
  505. end adjustURLs
  506.  
  507. -- ------------------------------------------------------
  508. (*
  509. alterString
  510.   thisText is the input string to change
  511.   delim is what string to change.  It doesn't have to be a single character.
  512.   replacement is the new string
  513.  
  514.   returns the changed string.
  515. *)
  516.  
  517. on alterString(thisText, delim, replacement)
  518.     set resultList to {}
  519.     set {tid, my text item delimiters} to {my text item delimiters, delim}
  520.     try
  521.         set resultList to every text item of thisText
  522.         set my text item delimiters to replacement
  523.         set resultString to resultList as string
  524.         set my text item delimiters to tid
  525.     on error
  526.         set my text item delimiters to tid
  527.     end try
  528.     return resultString
  529. end alterString
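(*
  Usage sketch for alterString. This is an illustration added for clarity, not part of the
  original script, and the sample strings are made up.

    alterString("line one" & return & "line two", return, character id 10)
      -- returns "line one", a linefeed, then "line two"
    alterString("say \\\"hi\\\"", "\\\"", "\"")
      -- returns: say "hi"

  Every occurrence of delim is replaced, and delim may be longer than one character.
*)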
  530.  
  531. -- ------------------------------------------------------
  532. (*
  533.   Split theString at theToken.  Returns {the text from theToken onward, the text before theToken}.
  534. *)
  535. on answerAndChomp(theString, theToken)
  536.     set debugging to false
  537.     set theOffset to offset of theToken in theString
  538.     if debugging then log "theOffset is " & theOffset
  539.     set theLength to length of theString
  540.     if theOffset > 0 then
  541.         set beginningPart to text 1 thru (theOffset - 1) of theString
  542.         if debugging then log "beginningPart is " & beginningPart
  543.        
  544.         set chompped to text theOffset thru theLength of theString
  545.         if debugging then log "chompped is " & chompped
  546.         return {chompped, beginningPart}
  547.     else
  548.         set beginningPart to ""
  549.         return {theString, beginningPart}
  550.     end if
  551.    
  552. end answerAndChomp
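(*
  Usage sketch for answerAndChomp. The values are illustrative and do not come from the
  original script.

    answerAndChomp("key=value", "=")
      -- returns {"=value", "key"}: the text from the token onward, then the text before it
    answerAndChomp("no token here", "=")
      -- returns {"no token here", ""} because the token is absent
*)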
  553.  
  554. -- ------------------------------------------------------
  555. (*
  556.   Delete the leading part of the string until and including theToken.
  557. *)
  558. on chompLeftAndTag(theString, theToken)
  559.     set debugging to false
  560.     --log text 1 thru ((offset of "my" in s) - 1) of s
  561.     --set rightString to offset of theToken in theString thru count of theString of theString
  562.     set theOffset to offset of theToken in theString
  563.     if debugging then log "theOffset is " & theOffset
  564.     set theLength to length of theString
  565.     if debugging then log "theLength is " & theLength
  566.     if theOffset > 0 then
  567.         set chompped to text (theOffset + (length of theToken)) thru theLength of theString
  568.         if debugging then log "chompped is " & chompped
  569.         return chompped
  570.     else
  571.         return ""
  572.     end if
  573. end chompLeftAndTag
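(*
  Usage sketch for chompLeftAndTag. The values are illustrative and do not come from the
  original script.

    chompLeftAndTag("<title>Hello there", "<title>")
      -- returns "Hello there"
    chompLeftAndTag("Hello there", "<title>")
      -- returns "" because the token is absent
*)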
  574.  
  575. -- ------------------------------------------------------  
  576. (*
  577. Yvan Koenig
  578. https://macscripter.net/viewtopic.php?id=43133
  579. *)
  580. on findExtension(inputFileName)
  581.     set fileName to inputFileName as string
  582.     set saveTID to AppleScript's text item delimiters
  583.     set AppleScript's text item delimiters to {"."}
  584.     set theExt to last text item of fileName
  585.     set AppleScript's text item delimiters to saveTID
  586.     --log "theExt is " & theExt
  587.     if theExt ends with ":" then set theExt to text 1 thru -2 of theExt
  588.     --log "theExt is " & theExt
  589.     return theExt
  590. end findExtension
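(*
  Usage sketch for findExtension. The paths are made-up examples.

    findExtension("Macintosh HD:Users:me:Desktop:notes.txt")
      -- returns "txt"
    findExtension("Macintosh HD:Users:me:Desktop:Page.HTML")
      -- returns "HTML"; the on open handler upper-cases it with makeCaseUpper before comparing
    findExtension("Macintosh HD:Applications:Safari.app:")
      -- returns "app"; the trailing colon of a bundle or folder path is dropped
*)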
  591.  
  592. -- ------------------------------------------------------
  593. (*
  594.   http://krypted.com/mac-os-x/to-hex-and-back/
  595.   0000000: 3c68 746d 6c3e 3c68 6561 643e 3c6d 6574  <html><head><met
  596. "         0    2    4    6    8    a    c    e     0 2 4 6 8 a c e"
  597.  
  598.  
  599.   *)
  600. on hexDumpFormat(textMessage, hex)
  601.     global debug
  602.     if debug ≥ 3 then log "in hexDumpFormat"
  603.     if debug ≥ 3 then log "hex string is " & hex
  604.     -- -r -p
  605.     set toUnix to "/bin/echo -n " & (quoted form of hex) & " | xxd  "
  606.     if debug ≥ 3 then log "toUnix is " & toUnix
  607.     try
  608.         set fromUnix to do shell script toUnix
  609.         log "variable " & textMessage & " in hex is " & return & "         0    2    4    6    8    a    c    e     0 2 4 6 8 a c e" & return & fromUnix
  610.     on error errMsg number n
  611.         log "==> convert hex string to string failed. " & errMsg & " with number " & n
  612.     end try
  613. end hexDumpFormat
  614.  
  615.  
  616. -- ------------------------------------------------------
  617. (*
  618.  
  619. https://stackoverflow.com/questions/55838252/minimum-value-that-not-zero
  620.        set m to get minimumPositiveNumber from {10, 2, 0, 2, 4}
  621.     log "m is " & m
  622.     set m to minimumPositiveNumber from {0, 0, 0}
  623.     log "m is " & m
  624. *)
  625. on minimumPositiveNumber from L
  626.     local L
  627.    
  628.     if L = {} then return null
  629.    
  630.     set |ξ| to 0
  631.    
  632.     repeat with x in L
  633.         set x to x's contents
  634.         if (x < |ξ| and x ≠ 0) ¬
  635.             or |ξ| = 0 then ¬
  636.             set |ξ| to x
  637.     end repeat
  638.    
  639.     |ξ|
  640. end minimumPositiveNumber
  641.  
  642. -- ------------------------------------------------------
  643. (*
  644.   makeCaseUpper("Now is the time, perhaps, for all good men")
  645. *)
  646. on makeCaseUpper(theString)
  647.     set UC to "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  648.     set LC to "abcdefghijklmnopqrstuvwxyz"
  649.     set C to characters of theString
  650.     repeat with ch in C
  651.         if ch is in LC then set contents of ch to item (offset of ch in LC) of UC
  652.     end repeat
  653.     return C as string
  654. end makeCaseUpper
  655.  
  656. -- ------------------------------------------------------
  657. on postToCLipboard(pleasePost)
  658.     try
  659.         -- osascript -e "set the clipboard to «data HTML${hex}»"     
  660.         set toUnixSet to "osascript -e \"set the clipboard to «data HTML" & pleasePost & \""
  661.         log "toUnixSet is " & printHeader("toUnixSet", toUnixSet)
  662.        
  663.         set fromUnixSet to do shell script toUnixSet
  664.         log "fromUnixSet is " & fromUnixSet
  665.        
  666.     on error errMsg number n
  667.         log "==> We tried to send back HTML data, but failed. " & errMsg & " with number " & n
  668.     end try
  669.     -- see what ended up on the clipboard
  670.     set theList2 to clipboard info
  671.     printClipboardInfo(theList2)
  672. end postToCLipboard
  673.  
  674. -- ------------------------------------------------------
  675. on printClipboardInfo(theList)
  676.     log (clipboard info)
  677.     log class of theList
  678.     log "Data types on the clipboard ... "
  679.     printList("", theList)
  680.     log "... "
  681. end printClipboardInfo
  682.  
  683. -- ------------------------------------------------------
  684. (* Log up to the first 400 characters of theString, labeled with theName. *)
  685. on printHeader(theName, theString)
  686.     global debug
  687.     if debug ≥ 3 then
  688.         log "in printHeader"
  689.         log theString
  690.         log length of theString
  691.     end if
  692.     if length of theString ≤ 0 then
  693.         log "==> no string to print"
  694.     else
  695.         log theName & " is " & text 1 thru (minimumPositiveNumber from {400, length of theString}) of theString & "<+++++++++"
  696.     end if
  697. end printHeader
  698.  
  699. -- ------------------------------------------------------
  700. (*
  701. print out the items in a list
  702.  
  703. *)
  704.  
  705. on printList(theName, splits)
  706.     try
  707.         set theCount to 1
  708.         repeat with theEntry in splits
  709.             --log "class of theEntry is " & class of theEntry
  710.             set classDisplay to class of theEntry as text
  711.             --log "classDisplay is " & classDisplay as text
  712.             --log "class of classDisplay is " & class of classDisplay
  713.             if classDisplay is "list" then
  714.                 log "    " & theName & theCount & " is " & item 1 of theEntry & "; " & item 2 of theEntry
  715.             else
  716.                 log "    " & theName & theCount & " is " & theEntry
  717.             end if
  718.             set theCount to theCount + 1
  719.         end repeat
  720.     on error errMsg number n
  721.         log "==> No go in printList. " & errMsg & " with number " & n
  722.     end try
  723. end printList
  724.  
  725. -- ------------------------------------------------------
  726. (*
  727. splitTextToList seems to be what you are trying to do
  728.   thisText is the input string
  729.   delim is what to split on
  730.  
  731.   results returned in a list
  732.  
  733.   Total hack. We know splitTextToList strips of delim so add it back.
  734. *)
  735.  
  736. on splitTextToList(thisText, delim)
  737.    
  738.     set returnedList to textToList(thisText, delim)
  739.     set resultArray to {}
  740.     copy item 1 of returnedList to the end of the resultArray
  741.    
  742.     repeat with i from 2 to (count of returnedList)
  743.         set newElement to delim & item i of returnedList
  744.         copy newElement to the end of the resultArray
  745.     end repeat
  746.    
  747.     return resultArray
  748. end splitTextToList
  749.  
  750. -- ------------------------------------------------------
  751. (*
  752.   Retrieve the text between the start tag and the end tag. Returns "" when either tag is missing.
  753. *)
  754. on tagContent(theString, startTag, endTag)
  755.     global debug
  756.     try
  757.         log "in tabContent. " & return & "    startTag is " & startTag & " endTag is " & endTag
  758.         set beginningOfTag to chompLeftAndTag(theString, startTag)
  759.         if length of beginningOfTag ≤ 0 then
  760.             set middleText to ""
  761.         else
  762.             printHeader("beginningOfTag", beginningOfTag)
  763.             set endingOffset to (offset of endTag in beginningOfTag)
  764.            
  765.             if endingOffset ≤ 1 then -- 0: endTag not found; 1: nothing precedes endTag
  766.                 set middleText to ""
  767.             else
  768.                 set middleText to text 1 thru (endingOffset - 1) of beginningOfTag
  769.                 printHeader("middleText", middleText)
  770.             end if
  771.         end if
  772.     on error errMsg number n
  773.         log "finding contained text failed. " & errMsg & " with number " & n
  774.         set middleText to ""
  775.     end try
  776.     if debug ≥ 2 then log "returning with middleText is " & middleText
  777.     return middleText
  778. end tagContent
  779.  
  780. (*
  781. textToList seems to be what you are trying to do
  782.   thisText is the input string
  783.   delim is what to split on
  784.  
  785.   returns a list of strings.  
  786.  
  787. - textToList was found here:
  788. - http://macscripter.net/viewtopic.php?id=15423
  789.  
  790. *)
  791.  
  792. on textToList(thisText, delim)
  793.     set resultList to {}
  794.     set {tid, my text item delimiters} to {my text item delimiters, delim}
  795.    
  796.     try
  797.         set resultList to every text item of thisText
  798.         set my text item delimiters to tid
  799.     on error
  800.         set my text item delimiters to tid
  801.     end try
  802.     return resultList
  803. end textToList
  804.  
  805. -- ------------------------------------------------------
  806. on typeHTML(theData)
  807.     global debug
  808.     log "in typeHTML" & return & "  Try to send back HTML."
  809.     try
  810.         set clipboardDataQuoted to quoted form of theData
  811.         log "quoted form is " & printHeader("clipboardDataQuoted", clipboardDataQuoted)
  812.         -- make hex string as required for HTML data on the clipboard
  813.         set toUnix to "/bin/echo -n " & clipboardDataQuoted & " | hexdump -ve '1/1 \"%.2x\"'"
  814.         log "toUnix is " & printHeader("toUnix", toUnix)
  815.        
  816.         set fromUnix to do shell script toUnix
  817.        
  818.         log "fromUnix is " & printHeader("fromUnix", fromUnix)
  819.         if debug ≥ 2 then
  820.             log "displaying original string --- so we can tell if it converted successfully. "
  821.             --hexDumpFormat("fromUnix", fromUnix)
  822.         end if
  823.     on error errMsg number n
  824.         log "==> convert to hex string failed. " & errMsg & " with number " & n
  825.         set fromUnix to ""
  826.     end try
  827.     return fromUnix
  828. end typeHTML
  829.  
  830. -- ------------------------------------------------------
  831. on typeText(theData)
  832.     (*
  833.          Unix-like systems    LF      0A      \n
  834.            (Linux, macOS)
  835.          Microsoft Windows    CRLF    0D 0A   \r\n
  836.          classic Mac OS       CR      0D      \r      AppleScript return
  837.          *)
  838.     global debug
  839.     set lf to character id 10
  840.     log "in typeText"
  841.     printHeader("the input  ( theData )", theData)
  842.     -- Example: -- https://discussions.apple.com/docs/DOC-8841
  843.     -- locate links
  844.    
  845.     set theOutputBuffer to theData
  846.     adjustURLs(theOutputBuffer)
  847.     (* add paragraphs *)
  848.    
  849.     -- Start theOutputBuffer with a paragraph tag.  We are taking a simple approach at this time.
  850.     set theOutputBuffer to "<p>" & theOutputBuffer
  851.     --  LF
  852.     -- Remember CRLF was changed to LF above and CR was changed to LF above.
  853.     -- we don't want no Windoze problems
  854.     set theOutputBuffer to alterString(theOutputBuffer, lf & lf, "</p><p> </p><p>")
  855.    
  856.     -- Does the string end with a dangling paragraph?  
  857.     if debug ≥ 3 then
  858.         log "length of theOutputBuffer is " & length of theOutputBuffer
  859.         log "((length of theOutputBuffer) - 2) is " & ((length of theOutputBuffer) - 2)
  860.         log "(length of theOutputBuffer)  is " & (length of theOutputBuffer)
  861.         log "((length of theOutputBuffer) - 3) is " & ((length of theOutputBuffer) - 3)
  862.     end if
  863.     if text ((length of theOutputBuffer) - 2) thru (length of theOutputBuffer) of theOutputBuffer is "<p>" then
  864.         set theOutputBuffer to text 1 thru ((length of theOutputBuffer) - 3) of theOutputBuffer
  865.     else if text ((length of theOutputBuffer) - 2) thru (length of theOutputBuffer) of theOutputBuffer is not "</p>" then
  866.         set theOutputBuffer to theOutputBuffer & "</p>"
  867.     end if
  868.    
  869.     log "theOutputBuffer is " & return & theOutputBuffer
  870.    
  871.     --convert to html clipboard format
  872.     return typeHTML(theOutputBuffer)
  873.    
  874. end typeText
  875.  
  876.  
  877.  
  878. (*
  879. https://www.oreilly.com/library/view/applescript-the-definitive/0596102119/re89.html
  880.  
  881. https://stackoverflow.com/questions/11085654/apple-script-how-can-i-copy-html-content-to-the-clipboard
  882.  
  883. -- user has copied a file's icon in the Finder
  884. clipboard info
  885. -- {{string, 20}, {«class ut16», 44}, {«class hfs », 80}, {«class
  886.  utf8», 20}, {Unicode text, 42}, {picture, 2616}, {«class icns», 43336},
  887. {«class furl», 62}}
  888.  
  889. textutil -convert html foo.rtf
  890.  
  891. if ((clipboard info) as string) contains "«class furl»" then
  892.         log "the clipboard contains a file named " & (the clipboard as string)
  893.     else
  894.         log "the clipboard does not contain a file"
  895.     end if
  896.    
  897. the clipboard       required
  898. as  class   optional
  899.  
  900. tell application "Script Editor"
  901.         activate
  902.     end tell
  903.    
  904. textutil has a simplistic text to html conversion
  905.     set clipboardDataQuoted to quoted form of theData
  906.     log "quoted form is " & clipboardDataQuoted
  907.    
  908.     set toUnix to "/bin/echo -n " & clipboardDataQuoted
  909.     set toUnix to toUnix & " | textutil -convert html -noload -nostore -stdin -stdout "
  910.     log "toUnix is " & toUnix
  911.     set fromUnix to do shell script toUnix
  912.     log "fromUnix  is " & fromUnix
  913.    
  914.    
  915. set s to "Today is my birthday"
  916. log text 1 thru ((offset of "my" in s) - 1) of s
  917. --> "Today is "
  918.             -- text 1 thru ((offset of "my" in s) - 1) of s
  919.             -- -1 since offset return the first character "m" position count
  920.            
  921. log "beginningOfTag is " & text 1 thru (minimumPositiveNumber from {200, length of beginningOfTag}) of beginningOfTag & "<+++++++++++++++++++++++"
  922.  
  923. https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/reference/ASLR_cmds.html
  924.  
  925. *)
  926.  
  927. --mac $ hex=`echo -n "<p>your html code here</>" | hexdump -ve '1/1 "%.2x"'`
  928. --mac $ echo $hex
  929. --3c703e796f75722068746d6c20636f646520686572653c2f3e
  930. --mac $ osascript -e "set the clipboard to «data HTML${hex}»"
  931. --mac $
  932. (*  
  933. A sub-routine for encoding ASCII characters.  
  934.  
  935. encode_char("$")  
  936. --> returns: "%24"  
  937.  
  938. based on:  
  939. https://www.macosxautomation.com/applescript/sbrt/sbrt-08.html  
  940.  
  941. *)
  942. (*
  943. Lowest Numeric Value in a List
  944.  
  945. This sub-routine will return the lowest numeric value in a list of items. The passed list can contain non-numeric data as well as lists within lists. For example:
  946.  
  947. lowest_number({-3.25, 23, 2345, "sid", 3, 67})
  948. --> returns: -3.25
  949. lowest_number({-3.25, 23, {-22, 78695, "bob"}, 2345, true, "sid", 3, 67})
  950. --> returns: -22
  951.  
  952. If there is no numeric data in the passed list, the sub-routine will return a null string ("")
  953.  
  954. lowest_number({"this", "list", "contains", "only", "text"})
  955. --> returns: ""
  956.  
  957. https://macosxautomation.com/applescript/sbrt/sbrt-03.html
  958.  
  959. Here's the sub-routine:
  960.  
  961. *)
  962. (*
  963. on lowestNumber(values_list)
  964.     set the low_amount to ""
  965.             set the low_value to lowestNumber(this_item)
  966.             if the low_value is less than the low_amount then ¬
  967.         set the item_class to the class of this_item
  968.         if the item_class is in {integer, real} then
  969.             if the low_amount is "" then
  970.                 set the low_amount to this_item
  971.             else if this_item is less than the low_amount then
  972.                 set the low_amount to item i of the values_list
  973.             end if
  974.         else if the item_class is list then
  975.             set the low_value to lowest_number(this_item)
  976.             if the the low_value is less than the low_amount then ¬
  977.                 set the low_amount to the low_value
  978.         end if
  979.     end repeat
  980.     return the low_amount
  981. end lowestNumber
  982.  
  983. https://lists.apple.com/archives/applescript-users/2010/Sep/msg00139.html
  984. set list_of_values to {10, 20, 30, 40, 50, 60, 2000, 9, 3000, 4}
  985.  
  986. set minimum to 9.9999999999E+12
  987. set maximum to 0
  988. repeat with ref_to_value in list_of_values
  989.     set the_value to contents of ref_to_value
  990.     if the_value > maximum then set maximum to the_value
  991.     if the_value < minimum then set minimum to the_value
  992. end repeat
  993.  
  994. {minimum, maximum}
  995.  
  996. may do the trick.
  997.  
  998. Yvan KOENIG (VALLAURIS, France) lundi 13 septembre 2010 22:32:41
  999. *)
  1000. (* https://lists.apple.com/archives/applescript-users/2010/Sep/msg00139.html
  1001. set list_of_values to {10, 20, 30, 40, 50, 60, 2000, 9, 3000, 4}
  1002.  
  1003. set minimum to 9.9999999999E+12
  1004.  
  1005. assume it's limited to positive values
  1006.  
  1007.  
  1008. on maxValue(list_of_values)
  1009.     global debug
  1010.     if debug ≥ 3 then log "in maxValue " & return & list_of_values
  1011.     set maximum to 0
  1012.     repeat with ref_to_value in list_of_values
  1013.         set the_value to contents of ref_to_value
  1014.         if the_value > maximum then set maximum to the_value
  1015.     end repeat
  1016.     if debug ≥ 3 then log maximum
  1017.     return maximum
  1018. end maxValue
  1019. *)
  1020. -- ------------------------------------------------------
  1021. (*
  1022. http://harvey.nu/applescript_url_encode_routine.html
  1023.  
  1024. on urlencode(theText)
  1025.     set theTextEnc to ""
  1026.     repeat with eachChar in characters of theText
  1027.         set useChar to eachChar
  1028.         set eachCharNum to ASCII number of eachChar
  1029.         if eachCharNum = 32 then
  1030.             set useChar to "+"
  1031.         else if (eachCharNum ≠ 42) and (eachCharNum ≠ 95) and (eachCharNum < 45 or eachCharNum > 46) and (eachCharNum < 48 or eachCharNum > 57) and (eachCharNum < 65 or eachCharNum > 90) and (eachCharNum < 97 or eachCharNum > 122) then
  1032.             set firstDig to round (eachCharNum / 16) rounding down
  1033.             set secondDig to eachCharNum mod 16
  1034.             if firstDig > 9 then
  1035.                 set aNum to firstDig + 55
  1036.                 set firstDig to ASCII character aNum
  1037.             end if
  1038.             if secondDig > 9 then
  1039.                 set aNum to secondDig + 55
  1040.                 set secondDig to ASCII character aNum
  1041.             end if
  1042.             set numHex to ("%" & (firstDig as string) & (secondDig as string)) as string
  1043.             set useChar to numHex
  1044.         end if
  1045.         set theTextEnc to theTextEnc & useChar as string
  1046.     end repeat
  1047.     return theTextEnc
  1048. end urlencode
  1049.  
  1050. Clipboard classes after a copy from the application.
  1051. from waterfox
  1052. (*«class HTML», 13876, «class utf8», 505, «class ut16», 1012, string, 505, Unicode text, 1010*)
  1053.  
  1054. from chrome
  1055. (*«class HTML», 748, «class utf8», 204, «class ut16», 410, string, 204, Unicode text, 408*)
  1056.  
  1057. from safari
  1058. (*«class weba», 120785, «class RTF », 70255, «class HTML», 122811, «class utf8», 3370, «class ut16», 6772, uniform styles, 47132, string, 3385, scrap styles, 8122, Unicode text, 6732, uniform styles, 47132, scrap styles, 8122*)
  1059.  
  1060. iCab
  1061. (*«class weba», 1665, «class RTF », 763, «class utf8», 121, «class ut16», 244, uniform styles, 376, string, 121, scrap styles, 62, Unicode text, 242, uniform styles, 376, scrap styles, 62*)
  1062.  
  1063. Opera
  1064. (*«class HTML», 5767, «class utf8», 150, «class ut16», 302, string, 150, Unicode text, 300*)
  1065.  
  1066. Textedit
  1067. (*«class RTF », 1136, «class utf8», 138, «class ut16», 278, uniform styles, 148, string, 138, scrap styles, 22, Unicode text, 276, uniform styles, 148, scrap styles, 22*)
  1068.  
  1069. Word
  1070. (*«class DSIG», 4, «class DOBJ», 56, «class OBJD», 244, «class RTF », 30573, «class HTML», 21160, scrap styles, 22, uniform styles, 136, string, 210, Unicode text, 420, «class PDF », 13197, picture, 154058, «class EMBS», 33280, «class LNKS», 909, «class LKSD», 244, «class OJLK», 93, «class HLNK», 1387, «class OFSC», 232, «class ut16», 422, «class DSIG», 4, «class DOBJ», 56, «class OBJD», 244, scrap styles, 22, uniform styles, 136, «class EMBS», 33280, «class LNKS», 909, «class LKSD», 244, «class OJLK», 93, «class HLNK», 1387, «class OFSC», 232*)
  1071.  
  1072. TextWrangler
  1073. (*«class utf8», 185, «class BBLM», 4, «class ut16», 372, string, 185, Unicode text, 370, «class BBLM», 4*)
  1074.  
  1075. *)
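
The hex-encoding step that typeHTML performs, and that the "--mac $" shell session in the notes walks through, can be tried on its own. What follows is a minimal sketch, not the script's own code: it swaps in od and tr for the hexdump call (an assumption, made only for portability), and to_hex and the sample string are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hex-encode stdin as lowercase hex digits with no separators.
# Assumption: od and tr stand in for the hexdump call used in typeHTML.
to_hex() {
    od -An -v -tx1 | tr -d ' \n'
}

# Hypothetical sample input.
html='<p>hi</p>'
hex=$(printf '%s' "$html" | to_hex)
echo "$hex"   # prints 3c703e68693c2f703e

# On macOS the hex string would then be handed to osascript,
# as postToCLipboard does:
#   osascript -e "set the clipboard to «data HTML${hex}»"
```

printf '%s' is used here instead of echo -n because the handling of -n differs between shells; the script sidesteps the same problem by calling /bin/echo explicitly.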