Macro Express Forums

Challenge: a macro to delete all but the first letter in every word



I have a friend who is a storyteller:  he writes original tales, memorizes them, and tells his stories in front of live audiences (which these days happen over Zoom).

 

He uses a "trick" that helps him memorize new stories more quickly. Once he has drafted a story, he deletes all but the first letter of every word.

 

So this:

 

Once upon a time, there was an ogre named Sandor.

 

Sandor lived in a 1000-room "castle" made of ice!

 

becomes this:

O u a t, t w a o n S.


S l i a 1000-r "c" m o i!

 

He memorizes stories by studying the abbreviated versions instead of the drafts.

Until recently, he abbreviated his stories manually -- a time-consuming task. He asked whether I could write a macro to automate the process.

 

The challenge: using Macro Express, process a text to delete all letters except for the first letter in every word. Keep all spaces, digits, punctuation marks, paragraph breaks, etc.

 

There are many ways to tackle this problem. I came up with two different solutions. Perhaps there is a clever RegEx solution, although I didn't attempt this!

Looking forward to seeing how others might solve this. Although both my solutions work, they have "issues." In one, the logic is convoluted. The other is simpler, but includes a cringe-worthy step that I wish wasn't necessary.
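For anyone curious about the RegEx route mentioned above, here is a minimal sketch. It is written in Python rather than Macro Express, so it only illustrates the pattern and says nothing about Macro Express's own regex support. The name `abbreviate_regex` is made up for this example. It assumes a "word" is any unbroken run of letters, and it keeps only the first letter of each run.

```python
import re

# A "letter" is any Unicode word character that is not a digit or underscore.
LETTER = r"[^\W\d_]"

# One letter (captured), followed by the rest of the run of letters.
WORD = re.compile(rf"({LETTER}){LETTER}*")


def abbreviate_regex(text: str) -> str:
    """Keep the first letter of every run of letters; leave everything else alone."""
    return WORD.sub(r"\1", text)


if __name__ == "__main__":
    print(abbreviate_regex("Once upon a time, there was an ogre named Sandor."))
    print(abbreviate_regex('Sandor lived in a 1000-room "castle" made of ice!'))
```

Because the apostrophe counts as a non-letter here, a contraction such as "don't" would come out as "d't".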


I would evaluate every character. Any character whose decimal value falls outside a-z and A-Z, I would store in a new variable. Otherwise, if the preceding character was a space, I would save it as well.
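A minimal Python sketch of one literal reading of that rule (not anyone's actual macro; the function names are invented for illustration). It keeps every non-letter, and keeps a letter only when the character before it is a space.

```python
def is_ascii_letter(ch: str) -> bool:
    """True when the decimal code is in A-Z (65-90) or a-z (97-122)."""
    code = ord(ch)
    return 65 <= code <= 90 or 97 <= code <= 122


def keep_after_space(text: str) -> str:
    """Keep non-letters, plus any letter whose preceding character is a space."""
    result = []
    previous = ""  # nothing precedes the first character
    for ch in text:
        if not is_ascii_letter(ch) or previous == " ":
            result.append(ch)
        previous = ch
    return "".join(result)


if __name__ == "__main__":
    print(repr(keep_after_space("Once upon a time")))
    print(repr(keep_after_space('a "castle" made')))
```

Read this literally and two cases slip through: the very first letter of the text has no space before it, and neither does a letter that follows an opening quote or parenthesis.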


Hi Cory,

 

If you have time, I encourage you to implement your solution. When I tried something similar, a logic flaw marred the result.

 

I would be curious to see how you overcome a limitation that might be inherent in the approach. When I said I had to do something "cringe-worthy" to my script, I meant that I had to add code to deal with the problem.


1 hour ago, rberq said:

The problem I am having is punctuation marks like quotes and parentheses that may or may not be preceded/followed by a blank space. 

 

Could you give an example of a text that does NOT transform properly? Reading your post makes me nervous that I didn't test my solution carefully enough!

 

Although I didn't say this when issuing this challenge, perhaps this clarification is in order. Before abbreviation, the text should follow English-language rules for punctuation, spacing, use of symbols, etc.

 

In other words, there's no expectation that the macro will work on non-standard mashups of symbols, punctuation marks, numbers, and letters, like this:

 

%g 4,/rtR< +G.$p  ?t'9

 


Here's the text I'm converting, followed by the result.  Part of the text is yours, part of it is my notes as to how the macro should operate.  There are still some problems with double-quotes and with parentheses.

 

TEXT:
       Once upon a time, there was an ogre named Sandor.

 

Sandor lived in a 1000-room "castle" made of ice!

 

Change all punctuation marks to "blank punctuation blank".
Change CR-LF to a single character BEL that is unused elsewhere (surrounded by blanks).
Insert one dummy blank at the beginning of the text.
Change all multiple consecutive blanks to single blanks.
Remove first character from text (must be a blank).
Repeat until done:
   Extract first character into result, remove from text.
   Locate first blank in text, remove all characters up to that position.
Repeat end
Change all punctuation marks from "blank punctuation blank" to "punctuation".  
Change selected punctuation marks to "punctuation blank" (periods, commas, etc.)  

 

RESULT:
O u a t,  t w a o n S.  

 

S l i a 1- r"  c"  m o i!  

 

C a p m t"  b p b" .  
C C- L t a s c B t i u e  (s b b).  
I o d b a t b o t t.  
C a m c b t s b.  
R f c f t  (m b a b).  
R u d
E f c i r,  r f t.  
L f b i t,  r a c u t t p.  
R e
C a p m f"  b p b"  t"  p" .  
C s p m t"  p b"   (p,  c,  e. )


You're so close!  My first attempts had similar problems. After many tweaks, I finally managed to get the macro to work.

 

But when I reviewed my code later, I couldn't follow the logic. The script had degenerated into a twisted mess of cascading IF-THEN statements, ranges of ASCII values, and multiple Boolean variables. I tinkered with the code in an effort to render it understandable, but every simplification "broke" the macro.

 

I awoke the next morning with an idea for a different method, similar to Cory's. When I tried it, my new macro ALMOST worked. I made one (ugly) change and it worked. Compared to the original, the new macro has half the code and runs a little faster.


OK, this seems to be working.  There are punctuation marks that are not in the text used for testing, and they are not accounted for in the macro.  I think to improve the macro, I would try a slightly different technique for punctuation and things like TAB characters -- perhaps replace each punctuation character with ESC and a digit.  For example, ESC1=period, ESC2=comma, etc.  Then in the result change the ESC sequences back to period, comma, and so on.  
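The ESC-and-digit idea could look roughly like this. It is a Python sketch, not Macro Express code, and the code table is hypothetical beyond the two entries named in the paragraph above (ESC1 = period, ESC2 = comma). It assumes the original text contains no ESC characters of its own.

```python
ESC = "\x1b"  # ASCII 27

# Hypothetical code table: only ESC1=period and ESC2=comma are named in the post.
CODES = {".": "1", ",": "2"}


def encode(text: str) -> str:
    """Swap each listed punctuation mark for ESC plus its digit."""
    for mark, digit in CODES.items():
        text = text.replace(mark, ESC + digit)
    return text


def decode(text: str) -> str:
    """Swap each ESC-plus-digit sequence back to its punctuation mark."""
    for mark, digit in CODES.items():
        text = text.replace(ESC + digit, mark)
    return text


if __name__ == "__main__":
    sample = "O u a t, t w a o n S."
    print(decode(encode(sample)) == sample)
```

With single digits, the table has room for only nine or ten marks; a longer list would need a different code scheme.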

//  
Log Errors to "C:\Temp\MacroExpressProLogFiles\MacroExpressPro_Macro_Log_File.txt"
Log Messages to "C:\Temp\MacroExpressProLogFiles\MacroExpressPro_Macro_Log_File.txt"
  "Macro executed: Temp_Extract_First_Letters"
Log Errors to Default Macro Log
// xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Keystroke Speed: 10 milliseconds
Mouse Speed: 30 milliseconds
// xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
//  
// Null character ascii 0
Variable Set to ASCII Char 0 to %NULL%
// BEL character ascii 7
Variable Set to ASCII Char 7 to %BEL%
// Escape character ascii 27
Variable Set to ASCII Char 27 to %ESC%
// Tab character ascii 9
Variable Set to ASCII Char 9 to %TAB%
// Line Feed (New Line) character ascii 10
Variable Set to ASCII Char 10 to %LINEFEED%
// Carriage Return character ascii 13
Variable Set to ASCII Char 13 to %CARRIAGERETURN%
// Carriage Return / Line Feed combination characters ascii 13 + ascii 10
Variable Set to ASCII Char 13 to %CRLF%
Variable Modify String %CRLF%: Append Text String Variable (%LINEFEED%)
// STX character ascii 2
Variable Set to ASCII Char 2 to %STX%
// ETX character ascii 3
Variable Set to ASCII Char 3 to %ETX%
// DC1 character ascii 17
Variable Set to ASCII Char 17 to %DC1%
// DC2 character ascii 18
Variable Set to ASCII Char 18 to %DC2%
//  
//  
Variable Set String %text% to "       Once upon a time, there was an ogre named Sandor.

Sandor lived in a 1000-room "castle" made of ice!

Change all punctuation marks to "blank punctuation blank".
Change CR-LF to a single character BEL that is unused elsewhere (surrounded by blanks).
Insert one dummy blank at the beginning of the text.
Change all multiple consecutive blanks to single blanks.
Remove first character from text (must be a blank).
Repeat until done:
   Extract first character into result, remove from text.
   Locate first blank in text, remove all characters up to that position.
Repeat end
Change all punctuation marks from "blank punctuation blank" to "punctuation".  
Change selected punctuation marks to "punctuation blank" (periods, commas, etc.)  "
Text Box Display: Diagnostics -- show text
If Variable %text% Contains "%CRLF%"
Text Box Display: Diagnostics
End If
// Insert one blank at beginning of text
Variable Set String %temptext% to " "
Variable Modify String %temptext%: Append Text String Variable (%text%)
Variable Modify String %text%: Copy Whole Text (%temptext%)
Text Box Display: Diagnostics -- text with extra blank at beginning
// Change all multiple consecutive blanks to single blanks.  
Repeat Start (Repeat 20 times)
  Variable Modify String: Replace "  " in %text% with " "
End Repeat
Text Box Display: Diagnostics -- text with multi blanks compressed to single blanks
// Surround punctuation marks with blanks
Variable Modify String: Replace "." in %text% with " . "
Variable Modify String: Replace "," in %text% with " , "
Variable Modify String: Replace "-" in %text% with " - "
Variable Modify String: Replace "!" in %text% with " ! "
Variable Modify String: Replace "(" in %text% with " ( "
Variable Modify String: Replace ")" in %text% with " ) "
Variable Modify String: Replace " "" in %text% with " %DC1% " // change left double-quote to DC1
Variable Modify String: Replace """ in %text% with " %DC2% " // change right double-quote to DC2 (all remaining double-quotes)
Variable Modify String: Replace "%CRLF%" in %text% with " %BEL% " // Change two-character sequence CRLF to single character BEL
Text Box Display: Diagnostics -- text after punctuation expansion
// Remove first character from text
Variable Modify String %text%: Delete a substring starting at 1 and 1 characters long
Text Box Display: Diagnostics -- text after deleting leading blank
// Extract first character of each word -- punctuation marks count as words
Variable Set String %result% to "" // Clear result text
Variable Modify String %text%: Append Text ( !@#$%^&*()) // Append garbage to end of text to identify where the end is
Repeat Until %text% Equals " !@#$%^&*()"
  Variable Modify String: Copy a substring in %text%, starting at 1 and 1 characters long to %onechar% // One character from text into temporary area
  Variable Modify String %result%: Append Text String Variable (%onechar%) // Append extracted character to result
  Variable Modify String %result%: Append Text ( ) // Append one blank to result
  Variable Set Integer %index% to the position of " " in %text%
  Variable Modify String %text%: Delete a substring starting at 1 and %index% characters long // Delete from text up to and including next blank space
End Repeat
Text Box Display: Diagnostics -- text after processing to end
Text Box Display: Diagnostics -- result text with one character per word
// Change all multiple consecutive blanks to single blanks.  
// Adjust selected punctuation marks
Repeat Start (Repeat 20 times)
  Variable Modify String: Replace " ." in %result% with ". " // period
  Variable Modify String: Replace " ," in %result% with ", " // comma
  Variable Modify String: Replace " -" in %result% with "-" // hyphen
  Variable Modify String: Replace "- " in %result% with "-" // hyphen
  Variable Modify String: Replace " !" in %result% with "! " // exclamation point
  Variable Modify String: Replace "!" in %result% with "! " // exclamation point
  Variable Modify String: Replace "( " in %result% with " (" // left paren
  Variable Modify String: Replace "(" in %result% with " (" // left paren
  Variable Modify String: Replace " )" in %result% with ")" // right paren
  Variable Modify String: Replace "%DC1% " in %result% with " "" // left double-quote
  Variable Modify String: Replace " %DC2% " in %result% with """ // right double-quote
  Variable Modify String: Replace "%BEL% " in %result% with "%CRLF%" // Change BEL characters back to CRLF
End Repeat
Repeat Start (Repeat 20 times)
  Variable Modify String: Replace "  " in %result% with " " // double space to single space
End Repeat
If Variable %result% Contains "%CRLF%"
Text Box Display: Diagnostics
End If
Text Box Display: Diagnostics -- final result after punctuation adjustment
//  
//  
Macro Return
//  
//  
.
.
.
.
<COMMENT Value=" "/>
<LOG ERRORS Filename="C:\\Temp\\MacroExpressProLogFiles\\MacroExpressPro_Macro_Log_File.txt" Hide_Errors="TRUE"/>
<LOG MESSAGES Filename="C:\\Temp\\MacroExpressProLogFiles\\MacroExpressPro_Macro_Log_File.txt" Message="Macro executed: Temp_Extract_First_Letters" Stamp="TRUE"/>
<LOG ERRORS Hide_Errors="FALSE" _ENABLED="FALSE"/>
<COMMENT Value="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"/>
<KEYSTROKE SPEED Delay="10"/>
<MOUSE SPEED Delay="30"/>
<COMMENT Value="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"/>
<COMMENT Value=" "/>
<COMMENT Value="Null character ascii 0"/>
<VARIABLE SET TO ASCII CHAR Value="0" Destination="%NULL%"/>
<COMMENT Value="BEL character ascii 7"/>
<VARIABLE SET TO ASCII CHAR Value="7" Destination="%BEL%"/>
<COMMENT Value="Escape character ascii 27"/>
<VARIABLE SET TO ASCII CHAR Value="27" Destination="%ESC%"/>
<COMMENT Value="Tab character ascii 9"/>
<VARIABLE SET TO ASCII CHAR Value="9" Destination="%TAB%"/>
<COMMENT Value="Line Feed (New Line) character ascii 10"/>
<VARIABLE SET TO ASCII CHAR Value="10" Destination="%LINEFEED%"/>
<COMMENT Value="Carriage Return character ascii 13"/>
<VARIABLE SET TO ASCII CHAR Value="13" Destination="%CARRIAGERETURN%"/>
<COMMENT Value="Carriage Return / Line Feed combination characters ascii 13 + ascii 10"/>
<VARIABLE SET TO ASCII CHAR Value="13" Destination="%CRLF%"/>
<VARIABLE MODIFY STRING Option="\x07" Destination="%CRLF%" Variable="%LINEFEED%" NoEmbeddedVars="FALSE"/>
<COMMENT Value="STX character ascii 2"/>
<VARIABLE SET TO ASCII CHAR Value="2" Destination="%STX%"/>
<COMMENT Value="ETX character ascii 3"/>
<VARIABLE SET TO ASCII CHAR Value="3" Destination="%ETX%"/>
<COMMENT Value="DC1 character ascii 17"/>
<VARIABLE SET TO ASCII CHAR Value="17" Destination="%DC1%"/>
<COMMENT Value="DC2 character ascii 18"/>
<VARIABLE SET TO ASCII CHAR Value="18" Destination="%DC2%"/>
<COMMENT Value=" "/>
<COMMENT Value=" "/>
<VARIABLE SET STRING Option="\x00" Destination="%text%" Value="       Once upon a time, there was an ogre named Sandor.\r\n\r\nSandor lived in a 1000-room \"castle\" made of ice! \r\n\r\nChange all punctuation marks to \"blank punctuation blank\".\r\nChange CR-LF to a single character BEL that is unused elsewhere (surrounded by blanks).\r\nInsert one dummy blank at the beginning of the text.\r\nChange all multiple consecutive blanks to single blanks. \r\nRemove first character from text (must be a blank).\r\nRepeat until done: \r\n   Extract first character into result, remove from text.\r\n   Locate first blank in text, remove all characters up to that position.\r\nRepeat end\r\nChange all punctuation marks from \"blank punctuation blank\" to \"punctuation\".  \r\nChange selected punctuation marks to \"punctuation blank\" (periods, commas, etc.)  " NoEmbeddedVars="TRUE"/>
<TEXT BOX DISPLAY Title="Diagnostics -- show text" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 %text%\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="1079" Height="409" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0"/>
<IF VARIABLE Variable="%text%" Condition="\x06" Value="%CRLF%" IgnoreCase="FALSE" _ENABLED="FALSE"/>
<TEXT BOX DISPLAY Title="Diagnostics" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 text contains crlf\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="278" Height="200" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0" _ENABLED="FALSE"/>
<END IF _ENABLED="FALSE"/>
<COMMENT Value="Insert one blank at beginning of text "/>
<VARIABLE SET STRING Option="\x00" Destination="%temptext%" Value=" " NoEmbeddedVars="TRUE"/>
<VARIABLE MODIFY STRING Option="\x07" Destination="%temptext%" Variable="%text%" NoEmbeddedVars="TRUE"/>
<VARIABLE MODIFY STRING Option="\x08" Destination="%text%" Variable="%temptext%" NoEmbeddedVars="TRUE"/>
<TEXT BOX DISPLAY Title="Diagnostics -- text with extra blank at beginning" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 %text%\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="1079" Height="409" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0" _ENABLED="FALSE"/>
<COMMENT Value="Change all multiple consecutive blanks to single blanks.  "/>
<REPEAT START Start="1" Step="1" Count="20" Save="FALSE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace="  " ReplaceWith=" " All="TRUE" IgnoreCase="TRUE" NoEmbeddedVars="FALSE"/>
<END REPEAT/>
<TEXT BOX DISPLAY Title="Diagnostics -- text with multi blanks compressed to single blanks" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 %text%\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="1079" Height="409" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0" _ENABLED="FALSE"/>
<COMMENT Value="Surround punctuation marks with blanks "/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace="." ReplaceWith=" . " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace="," ReplaceWith=" , " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace="-" ReplaceWith=" - " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace="!" ReplaceWith=" ! " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace="(" ReplaceWith=" ( " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace=")" ReplaceWith=" ) " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace=" \"" ReplaceWith=" %DC1% " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="FALSE" _COMMENT="change left double-quote to DC1"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace="\"" ReplaceWith=" %DC2% " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="FALSE" _COMMENT="change right double-quote to DC2 (all remaining double-quotes)"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%text%" ToReplace="%CRLF%" ReplaceWith=" %BEL% " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="FALSE" _COMMENT="Change two-character sequence CRLF to single character BEL"/>
<TEXT BOX DISPLAY Title="Diagnostics -- text after punctuation expansion" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 %text%\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="1079" Height="409" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0"/>
<COMMENT Value="Remove first character from text "/>
<VARIABLE MODIFY STRING Option="\x0A" Destination="%text%" Start="1" Count="1"/>
<TEXT BOX DISPLAY Title="Diagnostics -- text after deleting leading blank" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 %text%\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="1079" Height="409" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0" _ENABLED="FALSE"/>
<COMMENT Value="Extract first character of each word -- punctuation marks count as words"/>
<VARIABLE SET STRING Option="\x00" Destination="%result%" NoEmbeddedVars="FALSE" _COMMENT="Clear result text"/>
<VARIABLE MODIFY STRING Option="\x06" Destination="%text%" Value=" !@#$%^&*()" NoEmbeddedVars="FALSE" _COMMENT="Append garbage to end of text to identify where the end is"/>
<REPEAT UNTIL Variable="%text%" Condition="\x00" Value=" !@#$%^&*()"/>
<VARIABLE MODIFY STRING Option="\x09" Destination="%onechar%" Variable="%text%" Start="1" Count="1" NoEmbeddedVars="TRUE" _COMMENT="One character from text into temporary area"/>
<VARIABLE MODIFY STRING Option="\x07" Destination="%result%" Variable="%onechar%" NoEmbeddedVars="TRUE" _COMMENT="Append extracted character to result"/>
<VARIABLE MODIFY STRING Option="\x06" Destination="%result%" Value=" " NoEmbeddedVars="FALSE" _COMMENT="Append one blank to result"/>
<VARIABLE SET INTEGER Option="\x0E" Destination="%index%" Text_Variable="%text%" Text=" " Ignore_Case="TRUE"/>
<VARIABLE MODIFY STRING Option="\x0A" Destination="%text%" Start="1" Count="%index%" _COMMENT="Delete from text up to and including next blank space"/>
<END REPEAT/>
<TEXT BOX DISPLAY Title="Diagnostics -- text after processing to end" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 %text%\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="1079" Height="409" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0" _ENABLED="FALSE"/>
<TEXT BOX DISPLAY Title="Diagnostics -- result text with one character per word" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 %result%\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="1079" Height="409" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0"/>
<COMMENT Value="Change all multiple consecutive blanks to single blanks.  "/>
<COMMENT Value="Adjust selected punctuation marks "/>
<REPEAT START Start="1" Step="1" Count="20" Save="FALSE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace=" ." ReplaceWith=". " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE" _COMMENT="period"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace=" ," ReplaceWith=", " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE" _COMMENT="comma"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace=" -" ReplaceWith="-" All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE" _COMMENT="hyphen"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace="- " ReplaceWith="-" All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE" _COMMENT="hyphen"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace=" !" ReplaceWith="! " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE" _COMMENT="exclamation point"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace="!" ReplaceWith="! " All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE" _COMMENT="exclamation point"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace="( " ReplaceWith=" (" All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE" _COMMENT="left paren"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace="(" ReplaceWith=" (" All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE" _COMMENT="left paren"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace=" )" ReplaceWith=")" All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="TRUE" _COMMENT="right paren"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace="%DC1% " ReplaceWith=" \"" All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="FALSE" _COMMENT="left double-quote"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace=" %DC2% " ReplaceWith="\"" All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="FALSE" _COMMENT="right double-quote"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace="%BEL% " ReplaceWith="%CRLF%" All="TRUE" IgnoreCase="FALSE" NoEmbeddedVars="FALSE" _COMMENT="Change BEL characters back to CRLF"/>
<END REPEAT/>
<REPEAT START Start="1" Step="1" Count="20" Save="FALSE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%result%" ToReplace="  " ReplaceWith=" " All="TRUE" IgnoreCase="TRUE" NoEmbeddedVars="FALSE" _COMMENT="double space to single space"/>
<END REPEAT/>
<IF VARIABLE Variable="%result%" Condition="\x06" Value="%CRLF%" IgnoreCase="FALSE" _ENABLED="FALSE"/>
<TEXT BOX DISPLAY Title="Diagnostics" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 result contains crlf\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="278" Height="200" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0" _ENABLED="FALSE"/>
<END IF _ENABLED="FALSE"/>
<TEXT BOX DISPLAY Title="Diagnostics -- final result after punctuation adjustment" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Tahoma;}{\\f1\\fnil Tahoma;}}\r\n\\viewkind4\\uc1\\pard\\f0\\fs20 %result%\\f1 \r\n\\par }\r\n" Left="Center" Top="Center" Width="1079" Height="409" Monitor="0" OnTop="FALSE" Keep_Focus="TRUE" Mode="\x00" Delay="0"/>
<COMMENT Value=" "/>
<COMMENT Value=" "/>
<MACRO RETURN/>
<COMMENT Value=" "/>
<COMMENT Value=" "/>
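A side note on the "Repeat 20 times" loops that squeeze double blanks: in Python, each replace-all pass roughly halves a run of blanks, so a fixed number of passes is usually plenty. Whether Macro Express's replace command scans the same way is an assumption on my part. The sketch below (with an invented function name) loops until no double blank remains instead of counting passes.

```python
def collapse_blanks(text: str) -> str:
    """Reduce every run of blanks to a single blank, looping until stable."""
    while "  " in text:
        text = text.replace("  ", " ")
    return text


if __name__ == "__main__":
    print(repr(collapse_blanks("       Once  upon   a time")))
```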

 


Wow! What a lot of work you funneled into this challenge. It's impressive that you managed to hold so much complexity together.


I tested your code with real-world examples, and it worked... except for one tiny miss. Not all digits are getting preserved. Your script seems to treat a string of digits like a string of letters, so "1234 Main St." emerges as "1 M S." instead of "1234 M S."

 

Your approach of appending a "garbage" string to the end of a text, and then processing the text until the garbage string is reached, is eminently steal-able. Clever. I'm going to try that!
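Here is a minimal Python sketch of that loop as I understand it from the macro above: append the "garbage" string, then keep taking the first character and cutting up to the next blank until only the garbage is left. The function name is invented, and the punctuation handling is left out entirely. It assumes the input never contains the sentinel string itself.

```python
SENTINEL = " !@#$%^&*()"  # the same "garbage" string the macro appends


def first_chars(text: str) -> str:
    """Take the first character of each blank-delimited word, stopping at the sentinel."""
    words = text.split()
    if not words:
        return ""
    # Single blanks between words, one trailing blank, then the sentinel.
    work = " ".join(words) + " " + SENTINEL
    result = []
    while work != SENTINEL:
        result.append(work[0])      # first character of the current word
        cut = work.index(" ") + 1   # position just past the next blank
        work = work[cut:]           # drop the word and its trailing blank
    return " ".join(result)


if __name__ == "__main__":
    print(first_chars("1234 Main St"))
```

Because every blank-delimited chunk is treated as a word, a number such as 1234 is cut down to its first digit, which matches the miss described above.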


15 minutes ago, acantor said:

Your approach of appending a "garbage" string to the end of a text, and then processing the text until the garbage string is reached, is eminently steal-able. Clever. I'm going to try that!

That's my invention that goes back 50+ years for syncing multiple sequentially-sorted tape files, where category breaks occurred every so often requiring totals processing and resetting of counters for the next category.  It can save a lot of convoluted logic otherwise needed, especially with three or more tapes.  I'm sure I was not the first nor the last to discover the technique -- with account numbers the "garbage" was generally just a string of 9999999....
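For readers who have not met the technique, here is a simplified Python illustration. It is not a reconstruction of the original tape programs: it merges just two ascending lists of account numbers and leaves out the category breaks and totals. The sentinel value and function name are assumptions chosen for the example, and the sentinel must be larger than any real key.

```python
HIGH = 9_999_999  # sentinel "account number", assumed larger than any real key


def merge_sorted(a: list[int], b: list[int]) -> list[int]:
    """Merge two ascending lists; the sentinel removes end-of-input special cases."""
    a = a + [HIGH]
    b = b + [HIGH]
    i = j = 0
    merged = []
    # Stop only when both inputs have reached their sentinel.
    while a[i] != HIGH or b[j] != HIGH:
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    return merged


if __name__ == "__main__":
    print(merge_sorted([1001, 1004], [1002, 1003, 1005]))
```

Without the sentinel, the loop would need separate branches for "first list exhausted" and "second list exhausted."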


Here's my solution. I had two aha! "lightbulb" moments as I worked on the problem.

 

1. I realized that characters are either alphabetical letters, or they are not alphabetical letters. Call the first group "letters" and the second group "non-letters." The latter includes punctuation, symbols, spaces, Tabs, CRs, etc.

 

A word always starts where a "letter" follows a "non-letter":

 

"Hello

(Hello

[space]Hello

[Tab]Hello

[New Line]Hello

 

2. I realized that it isn't necessary to convert characters to ASCII codes. Instead, I defined a list of alphabetical characters:

 

Let %Alphabet% = "abc...xyz"

 

Test if %Alphabet% CONTAINS a character. For example, if the character is "m," %Alphabet% contains it, so the character must be a letter. If %Alphabet% doesn't contain the character -- e.g., a space, digit, or punctuation mark -- it must be a non-letter.

 

Eventually, I expanded %Alphabet% to include letters that appear in borrow-words like voilà, über, and señor.
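The containment test translates to other languages almost word for word. Here is a small Python sketch; the name `is_letter` is invented, and ignoring case is my reading of the IgnoreCase="TRUE" setting in the exported macro below.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzáàâäæçéèêëíìîïñóòôöœßúùûüýÿ"


def is_letter(ch: str) -> bool:
    """A single character is a letter if the alphabet string contains it (case ignored)."""
    # The length check matters: in Python an empty string is "in" every string.
    return len(ch) == 1 and ch.lower() in ALPHABET


if __name__ == "__main__":
    for ch in ["m", "M", "ñ", " ", "7", "!"]:
        print(repr(ch), is_letter(ch))
```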

 

The "cringe-worthy" concession was being forced to add a non-letter to the start of the string. The macro inspects each character; if a character is a "letter," then it inspects the previous character. But while the script is inspecting the first character in the string, there is no previous character. So the character must be added to the left before the string is processed, and stripped off after the string has been processed. It's ugly, but I couldn't think of a clean workaround!
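To make the guard character concrete, here is a minimal Python sketch of the same logic. It is not a translation of the macro; the function name is invented, and it works on a plain string instead of the clipboard. A space is added on the left so that every character, including the first, has a previous character to inspect, and the space is stripped at the end.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzáàâäæçéèêëíìîïñóòôöœßúùûüýÿ"


def abbreviate(text: str) -> str:
    """Drop every letter that directly follows another letter; keep everything else."""
    padded = " " + text          # guard: a non-letter to the left of character 1
    result = [padded[0]]         # the guard itself is a non-letter, so it is kept
    for i in range(1, len(padded)):
        ch, prev = padded[i], padded[i - 1]
        ch_is_letter = ch.lower() in ALPHABET
        prev_is_letter = prev.lower() in ALPHABET
        # Keep non-letters, and letters whose left neighbour is a non-letter.
        if not ch_is_letter or not prev_is_letter:
            result.append(ch)
    return "".join(result)[1:]   # strip the guard added above


if __name__ == "__main__":
    print(abbreviate("Once upon a time, there was an ogre named Sandor."))
    print(abbreviate('Sandor lived in a 1000-room "castle" made of ice!'))
```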

 

This script has at least one limitation. A convention in academic writing is to use brackets or parentheses in the middle of a word to denote letter substitutions, usually because the capitalization has been changed in a quotation. For example:

 

Smith wrote, "[t]here is no way."

 

The macro changes it to S w, "[t]h i n w." instead of S w, "[t] i n w."

 

But I view this "failure" as totally inconsequential for the needs of my friend the storyteller!

 

My first effort, before the aha! moments, involved about 50 lines of spaghetti code. This version has only 21. I came up with a 19-line version, but it was harder to follow. I decided clarity should trump brevity.

 

// Delete all but the first letter of every word. Preserve spaces, digits, punctuation, line breaks, etc.
 
// How this script decides a character is the first letter in a word
// Assume every character belongs to one of two groups:
// 1. "Letters." I've included 53 letters from English, French, Spanish, Italian, German, & Dutch alphabets
// 2. "Non-letters." Everything that is not a letter: digits, punctuation, spaces, CRs, LFs, etc.
 
// The first character of a word is always a "letter" and
// The character to its left is a "non-letter"
 
Variable Set String %Alphabet% to "abcdefghijklmnopqrstuvwxyzáàâäæçéèêëíìîïñóòôöœßúùûüýÿ"
 
Clipboard Copy
 
Variable Set String %Clip% from the clipboard contents
Variable Set String %Clip% to " %Clip%" // Prepend a space: need a non-letter to the left of character 1
 
// Check each character, one at a time, from left to right
Variable Set Integer %NumberOfCharacters% to the length of variable %Clip%
 
Repeat Start (Repeat %NumberOfCharacters% times)
  Variable Modify String: Copy part of text in %Clip% starting at %Count% and 1 characters long to %Char%
  If Variable %Alphabet% Contains "%Char%"
  // The CURRENT character is a letter, so check the PREVIOUS character
    Variable Modify Integer: %PrevCount% = %Count% - 1
    Variable Modify String: Copy part of text in %Clip% starting at %PrevCount% and 1 characters long to %PrevChar%
    If Variable %Alphabet% Does not Contain "%PrevChar%"
    // The PREVIOUS character is a non-letter. Ergo, the CURRENT character is the first letter of a word
      Variable Set String %Result% to "%Result%%Char%" // Append current character to %Result%
    End If
  Else
  // The CURRENT character is not a letter, so we want to keep it
    Variable Set String %Result% to "%Result%%Char%" // Append current character to %Result%
  End If
  Variable Modify Integer %Count%: Increment
End Repeat
 
Variable Modify String: Delete part of text from %Result% starting at 1 and 1 characters long // Delete the space added above
 
Text Type (Simulate Keystrokes): <ARROW RIGHT><ENTER><ENTER> // Deselect and add two blank lines
Text Type (Use Clipboard and Paste Text): %Result%

  
<COMMENT Value="Delete all but the first letter of every word. Preserve spaces, digits, punctuation, line breaks, etc."/>
<COMMENT/>
<COMMENT Value="How this script decides a character is the first letter in a word"/>
<COMMENT Value="Assume every character belongs to one of two groups:"/>
<COMMENT Value="1. \"Letters.\" I've included 53 letters from English, French, Spanish, Italian, German, & Dutch alphabets"/>
<COMMENT Value="2. \"Non-letters.\" Everything that is not a letter: digits, punctuation, spaces, CRs, LFs, etc."/>
<COMMENT/>
<COMMENT Value="The first character of a word is always a \"letter\" and"/>
<COMMENT Value="The character to its left is a \"non-letter\""/>
<COMMENT/>
<VARIABLE SET STRING Option="\x00" Destination="%Alphabet%" Value="abcdefghijklmnopqrstuvwxyzáàâäæçéèêëíìîïñóòôöœßúùûüýÿ" NoEmbeddedVars="FALSE"/>
<COMMENT/>
<CLIPBOARD COPY/>
<COMMENT/>
<VARIABLE SET STRING Option="\x02" Destination="%Clip%" NoEmbeddedVars="FALSE"/>
<VARIABLE SET STRING Option="\x00" Destination="%Clip%" Value=" %Clip%" NoEmbeddedVars="FALSE" _COMMENT="Prepend a space: need a non-letter to the left of character 1"/>
<COMMENT/>
<COMMENT Value="Check each character, one at a time, from left to right"/>
<VARIABLE SET INTEGER Option="\x0D" Destination="%NumberOfCharacters%" Text_Variable="%Clip%"/>
<COMMENT/>
<REPEAT START Start="1" Step="1" Count="%NumberOfCharacters%" Save="TRUE" Variable="%Count%"/>
<VARIABLE MODIFY STRING Option="\x09" Destination="%Char%" Variable="%Clip%" Start="%Count%" Count="1" NoEmbeddedVars="FALSE"/>
<IF VARIABLE Variable="%Alphabet%" Condition="\x06" Value="%Char%" IgnoreCase="TRUE"/>
<COMMENT Value="The CURRENT character is a letter, so check the PREVIOUS character"/>
<VARIABLE MODIFY INTEGER Option="\x01" Destination="%PrevCount%" Value1="%Count%" Value2="1"/>
<VARIABLE MODIFY STRING Option="\x09" Destination="%PrevChar%" Variable="%Clip%" Start="%PrevCount%" Count="1" NoEmbeddedVars="FALSE"/>
<IF VARIABLE Variable="%Alphabet%" Condition="\x07" Value="%PrevChar%" IgnoreCase="TRUE"/>
<COMMENT Value="The PREVIOUS character is a non-letter. Ergo, the CURRENT character is the first letter of a word"/>
<VARIABLE SET STRING Option="\x00" Destination="%Result%" Value="%Result%%Char%" NoEmbeddedVars="FALSE" _COMMENT="Append current character to %Result%"/>
<END IF/>
<ELSE/>
<COMMENT Value="The CURRENT character is not a letter, so we want to keep it"/>
<VARIABLE SET STRING Option="\x00" Destination="%Result%" Value="%Result%%Char%" NoEmbeddedVars="FALSE" _COMMENT="Append current character to %Result%"/>
<END IF/>
<VARIABLE MODIFY INTEGER Option="\x07" Destination="%Count%"/>
<END REPEAT/>
<COMMENT/>
<VARIABLE MODIFY STRING Option="\x0A" Destination="%Result%" Start="1" Count="1" _COMMENT="Delete the space added above"/>
<COMMENT/>
<TEXT TYPE Action="0" Text="<ARROW RIGHT><ENTER><ENTER>" _COMMENT="Deselect and add two blank lines"/>
<TEXT TYPE Action="1" Text="%Result%"/>
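
For anyone who wants to follow the logic without Macro Express in front of them, here is a rough Python sketch of the same idea. It is only an illustration, not a translation of the macro: the names `LETTERS`, `is_letter`, and `abbreviate` are made up for this example, and the loop simply skips the prepended space instead of deleting it from the result afterwards.

```python
# Illustrative sketch only; the names below are hypothetical, not Macro Express commands.
# LETTERS plays the role of the macro's %Alphabet% variable.
LETTERS = set("abcdefghijklmnopqrstuvwxyzáàâäæçéèêëíìîïñóòôöœßúùûüýÿ")


def is_letter(ch):
    # Case-insensitive membership test, similar in spirit to IgnoreCase="TRUE"
    return ch.lower() in LETTERS


def abbreviate(text):
    # Prepend a space so that the first real character has a non-letter to its left
    padded = " " + text
    result = []
    # Start at index 1 so the prepended space never reaches the result
    for i in range(1, len(padded)):
        ch = padded[i]
        prev = padded[i - 1]
        if not is_letter(ch):
            # Non-letters (spaces, digits, punctuation, line breaks) are always kept
            result.append(ch)
        elif not is_letter(prev):
            # A letter whose left neighbour is a non-letter starts a word
            result.append(ch)
    return "".join(result)


print(abbreviate("Once upon a time, there was an ogre named Sandor."))
```

The prepended space means the loop body never needs a special case for position 1.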

 

Link to comment
Share on other sites

1 hour ago, acantor said:

The "cringe-worthy" concession was being forced to add a non-letter to the start of the string.

I object.  That is not the least bit cringe-worthy, no more than my appending a garbage string of characters to denote the end of the text.  Something like that can greatly simplify the convoluted logic that otherwise might be needed.  As long as you document it so the next programmer won't be confused by the technique, it's praise-worthy rather than cringe-worthy. 

 

I like your solution better than mine -- my whole difficulty was special treatment of punctuation, and you avoided that issue.  Sometimes I can make complex problems simple, and other times I can make simple problems complex.🙃

Link to comment
Share on other sites

Sigh. You're probably right, rberq. I guess there's nothing wrong with boundary conditions and special cases, especially when they are inherent to a problem!

 

I did think of two workarounds; but as you point out, workarounds have the potential to introduce additional complexity and convoluted logic: 1. Deal with the first character, and then move on to the main repeat loop to deal with the second through to the last character. 2. Parse the string from the end instead of the beginning. But that would mean spitting out the result in reverse order, which is do-able with Macro Express; but then we've added complexity. Or I can imagine strings that consist of an odd number of characters may need to be handled differently than strings that consist of an even number of characters. All of a sudden, that single space I slipped into the beginning of the string doesn't seem so bad.
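
To make the trade-off concrete, here is a minimal Python sketch of the first workaround (handle the first character on its own, then loop over the rest). It is an illustration under assumptions, not Macro Express code: the function name `abbreviate_first_char_special` is invented, and Python's built-in `str.isalpha` stands in for the macro's alphabet check.

```python
def abbreviate_first_char_special(text, is_letter=str.isalpha):
    # Extra special case that the prepended-space trick avoids: an empty selection
    if not text:
        return ""
    # The first character is always kept: it is either a non-letter
    # or the first letter of the first word
    result = [text[0]]
    # Main loop covers the second through to the last character
    for i in range(1, len(text)):
        ch = text[i]
        prev = text[i - 1]
        # Keep non-letters, and letters that follow a non-letter
        if not is_letter(ch) or not is_letter(prev):
            result.append(ch)
    return "".join(result)


print(abbreviate_first_char_special('Sandor lived in a 1000-room "castle" made of ice!'))
```

The result is the same, but the code now carries two special cases (empty text and the first character) where the prepended space needed none.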

 

Quote

Sometimes I can make complex problems simple, and other times I can make simple problems complex.

 

Me too!

 

Link to comment
Share on other sites

  • 3 weeks later...

I haven't been on the forums in a while, but I wandered in over the weekend, saw this challenge, and put myself to work on it. This is what I came up with. I didn't look at the other posts until after I had gotten this one working. I can already see that mine is probably not as good or elegant as the others already posted. Some things came out the same, but I did take a slightly different path. I sure enjoyed the challenge! 😀

<COMMENT Value="Setting it All Up"/>
<VARIABLE SET STRING Option="\x00" Destination="%Alphabet%" Value="abcdefghijklmnopqrstuvwxyz" NoEmbeddedVars="FALSE" _COMMENT="These are the characters we will ONLY keep if they are at the beginning of a word, or are following a special character."/>
<VARIABLE SET STRING Option="\x02" Destination="%Unedited%" NoEmbeddedVars="FALSE" _COMMENT="Copy the Original Text to clipboard before Running this macro."/>
<VARIABLE SET TO ASCII CHAR Value="13" Destination="%CR%"/>
<VARIABLE SET TO ASCII CHAR Value="10" Destination="%LF%"/>
<VARIABLE SET STRING Option="\x00" Destination="%CRLF%" Value="%CR%%LF%" NoEmbeddedVars="FALSE"/>
<VARIABLE MODIFY STRING Option="\x0F" Destination="%Unedited%" ToReplace="%CR%%LF%" ReplaceWith="¬" All="TRUE" IgnoreCase="TRUE" NoEmbeddedVars="FALSE" _COMMENT="Replacing carriage returns and linefeeds with a nonsense character."/>
<VARIABLE SET INTEGER Option="\x0D" Destination="%SizeOfUnedited%" Text_Variable="%Unedited%" _COMMENT="Length of the text will tell the macro when to stop."/>
<VARIABLE MODIFY STRING Option="\x09" Destination="%FirstLetter%" Variable="%Unedited%" Start="1" Count="1" NoEmbeddedVars="FALSE" _COMMENT="Assuming the first character will always be kept, so not including it inside the Repeat function."/>
<VARIABLE SET STRING Option="\x00" Destination="%LastLetter%" Value="%FirstLetter%" NoEmbeddedVars="FALSE"/>
<COMMENT Value="//"/>
<COMMENT Value="Putting the \"Kept\" characters/symbols directly into a text file rather than a variable so I can troubleshoot in 'real time' with a File Explorer with Preview enabled."/>
<VARIABLE MODIFY STRING Option="\x11" Destination="%FirstLetter%" Filename="D:\\Edited Story.txt" Strip="FALSE" NoEmbeddedVars="FALSE" _COMMENT="This command creates the .txt file from scratch, even if it previously existed from prior runs."/>
<VARIABLE SET BOOL Destination="%UseIt%" Command="263" Value="FALSE"/>
<COMMENT Value="Start the Filter"/>
<REPEAT START Start="1" Step="1" Count="%SizeOfUnedited%" Save="FALSE"/>
<VARIABLE MODIFY STRING Option="\x0A" Destination="%unedited%" Start="1" Count="1" _COMMENT="Remove the most recent character from consideration."/>
<VARIABLE MODIFY STRING Option="\x09" Destination="%NextLetter%" Variable="%Unedited%" Start="1" Count="1" NoEmbeddedVars="FALSE" _COMMENT="Check the next character"/>
<COMMENT Value="Step 1:\r\nIf the character is in the Alphabet, %UseIt% will be False\r\n\r\nIf it is not in the Alphabet, %UseIt% will be True and %special% will retain that character for future consideration."/>
<IF VARIABLE Variable="%Alphabet%" Condition="\x07" Value="%NextLetter%" IgnoreCase="TRUE"/>
<VARIABLE SET BOOL Destination="%UseIt%" Command="263" Value="TRUE"/>
<VARIABLE SET STRING Option="\x00" Destination="%special%" Value="%NextLetter%" NoEmbeddedVars="FALSE"/>
<ELSE _COMMENT="Step 1a)\r\nI struggled with differentiating spaces from other special characters. I may have found a way around it, but am leaving this in as an OR statement."/>
<IF VARIABLE Variable="%LastLetter%" Condition="\x00" Value=" " IgnoreCase="FALSE"/>
<OR/>
<IF VARIABLE Variable="%LastLetter%" Condition="\x00" Value="%special%" IgnoreCase="FALSE"/>
<VARIABLE SET BOOL Destination="%UseIt%" Command="263" Value="TRUE"/>
<ELSE/>
<VARIABLE SET BOOL Destination="%UseIt%" Command="263" Value="FALSE"/>
<END IF/>
<END IF/>
<COMMENT Value="If our Boolean is True, we want to keep the special character only if it is NOT a carriage return. I found that while a carriage return shows up in String variable as \"\", it was not treated as \"\" when performing If Variable checks."/>
<IF VARIABLE Variable="%UseIt%" Condition="\x00" Value="True" IgnoreCase="FALSE"/>
<AND/>
<IF VARIABLE Variable="%NextLetter%" Condition="\x01" Value="¬" IgnoreCase="FALSE"/>
<VARIABLE MODIFY STRING Option="\x12" Destination="%NextLetter%" Filename="D:\\Edited Story.txt" Strip="FALSE" NoEmbeddedVars="FALSE"/>
<END IF/>
<IF VARIABLE Variable="%NextLetter%" Condition="\x00" Value="¬" IgnoreCase="FALSE" _COMMENT="And here we re-add the CR/LF back to the .txt file so our output can maintain its structural integrity."/>
<VARIABLE MODIFY STRING Option="\x12" Destination="%CRLF%" Filename="D:\\Edited Story.txt" Strip="FALSE" NoEmbeddedVars="FALSE"/>
<END IF/>
<VARIABLE SET STRING Option="\x00" Destination="%LastLetter%" Value="%NextLetter%" NoEmbeddedVars="FALSE"/>
<VARIABLE SET BOOL Destination="%UseIt%" Command="263" Value="FALSE"/>
<END REPEAT/>
<TEXT BOX DISPLAY Title="Modifications Complete" Content="{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 Lucida Fax;}}\r\n{\\colortbl ;\\red128\\green0\\blue0;}\r\n\\viewkind4\\uc1\\pard\\cf1\\b\\f0\\fs44 Done!\\cf0\\b0 \r\n\\par }\r\n" Left="Center" Top="Center" Width="278" Height="200" Monitor="0" OnTop="TRUE" Keep_Focus="TRUE" Mode="\x00" Delay="0"/>

 

Link to comment
Share on other sites

Hi Steve,

 

It's been years since you've haunted this Forum. Welcome back!

 

I really appreciate the comments you peppered in your solution, which helped clarify your intentions. It's interesting that the approach you settled on is a hybrid between rberq's approach and mine. There really are many ways to solve this problem.

 

For me, the "Holy Grail" to cracking this puzzle would be to use Regular Expressions. I tried, got close to a solution, but I couldn't figure it out. 
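
For what it's worth, here is how the idea might look as a regular expression, sketched in Python's `re` module rather than in Macro Express. Whether Macro Express's regex support accepts the same pattern and replacement syntax is an assumption I have not verified. The names `WORD` and `abbreviate_regex` are made up for this example. The character class `[^\W\d_]` means "a word character that is not a digit or underscore", which is roughly "any letter", accented ones included.

```python
import re

# Capture the first letter of a run of letters, then match the rest of the run.
# [^\W\d_] = word character that is neither a digit nor an underscore (roughly: a letter).
WORD = re.compile(r"([^\W\d_])[^\W\d_]*")


def abbreviate_regex(text):
    # Replace each run of letters with just its captured first letter;
    # everything else (spaces, digits, punctuation, line breaks) is untouched
    return WORD.sub(r"\1", text)


print(abbreviate_regex("Once upon a time, there was an ogre named Sandor."))
```

Because the pattern only ever matches runs of letters, no prepended space or first-character special case is needed.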

Link to comment
Share on other sites

Thanks, acantor :) 

 

I peek in from time to time when I need a quick refresher on how to do something and don't feel like going through trial and error to figure it out for myself. Especially when I know it's something that should be simple. But I so rarely have enough free time to participate anymore. Oddly enough I saw a string of posts with "challenge" in the title and that motivated me to find some time (it was Easter weekend... I made time to relax).

 

I really wanted to find a way to incorporate ASCII or Text File Begin/End loops, but gave up on that approach right away. It's been too long since I've had to build one from scratch, and don't use them very frequently anymore in my day-to-day macros (not sure if any of my current day-to-days even use them). Since we're not playing with formatted, structured tables or whatever, I don't think they would have really been much use anyway.

 

I don't think I'd ever used a Boolean variable in a macro before, but as I was attacking this problem, Boolean just made sense. The real struggle for me was the non-space special characters. I'd have it almost dialed in except it would either ignore (or duplicate!) any text following a special character. I'd come up with something that I was 99% sure would fix the problem... and it generally fixed the problem by ignoring the rest of the macro. I came up with some pretty complex ways to generate a text file that was identical to the original text :D .

 

I may take a look at some of your earlier challenge posts just to see what's there and throw in my ideas (if I have any). 

 

From prior experience, if Cory takes a stab at a problem, he inevitably comes up with the superior solution.
