h8rt3rmin8r

regex_google.txt

Feb 2nd, 2020
  1. RE2 regular expression syntax reference
  2. ---------------------------------------
  3.  
  4. Single characters:
  5. . any character, possibly including newline (s=true)
  6. [xyz] character class
  7. [^xyz] negated character class
  8. \d Perl character class
  9. \D negated Perl character class
  10. [[:alpha:]] ASCII character class
  11. [[:^alpha:]] negated ASCII character class
  12. \pN Unicode character class (one-letter name)
  13. \p{Greek} Unicode character class
  14. \PN negated Unicode character class (one-letter name)
  15. \P{Greek} negated Unicode character class
  16.  
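A minimal sketch of the single-character forms above, assuming Go's standard `regexp` package as the RE2-syntax engine. The `fullMatch` helper is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch reports whether pattern matches the whole input.
// Hypothetical helper for this sketch, not a library function.
func fullMatch(pattern, input string) bool {
	return regexp.MustCompile(`\A(?:` + pattern + `)\z`).MatchString(input)
}

func main() {
	fmt.Println(fullMatch(`.`, "a"))              // true
	fmt.Println(fullMatch(`.`, "\n"))             // false: the s flag is off by default
	fmt.Println(fullMatch(`(?s).`, "\n"))         // true: s=true lets . match newline
	fmt.Println(fullMatch(`[^xyz]`, "x"))         // false
	fmt.Println(fullMatch(`[[:alpha:]]`, "q"))    // true
	fmt.Println(fullMatch(`[[:^alpha:]]`, "7"))   // true
	fmt.Println(fullMatch(`\pL`, "\u00e9"))       // true: U+00E9 is a letter
	fmt.Println(fullMatch(`\p{Greek}`, "\u03bb")) // true: U+03BB is Greek lambda
	fmt.Println(fullMatch(`\P{Greek}`, "\u03bb")) // false
}
```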
  17. Composites:
  18. xy «x» followed by «y»
  19. x|y «x» or «y» (prefer «x»)
  20.  
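The "prefer «x»" rule for alternation can be observed directly. This is a small sketch, assuming Go's `regexp` package as the RE2-syntax engine; `leftmost` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// leftmost returns the text of the leftmost match of pattern in input,
// or "" when there is no match. Hypothetical helper for this sketch.
func leftmost(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	fmt.Println(leftmost(`ab`, "cab"))  // ab: x followed by y
	fmt.Println(leftmost(`a|ab`, "ab")) // a: the first alternative wins
	fmt.Println(leftmost(`ab|a`, "ab")) // ab: reordering changes the result
}
```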
  21. Repetitions:
  22. x* zero or more «x», prefer more
  23. x+ one or more «x», prefer more
  24. x? zero or one «x», prefer one
  25. x{n,m} «n» or «n»+1 or ... or «m» «x», prefer more
  26. x{n,} «n» or more «x», prefer more
  27. x{n} exactly «n» «x»
  28. x*? zero or more «x», prefer fewer
  29. x+? one or more «x», prefer fewer
  30. x?? zero or one «x», prefer zero
  31. x{n,m}? «n» or «n»+1 or ... or «m» «x», prefer fewer
  32. x{n,}? «n» or more «x», prefer fewer
  33. x{n}? exactly «n» «x»
  34. x{} (== x*) NOT SUPPORTED vim
  35. x{-} (== x*?) NOT SUPPORTED vim
  36. x{-n} (== x{n}?) NOT SUPPORTED vim
  37. x= (== x?) NOT SUPPORTED vim
  38.  
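The "prefer more" and "prefer fewer" wording above corresponds to greedy and non-greedy matching. A short sketch, assuming Go's `regexp` package as the RE2-syntax engine (the `leftmost` helper is hypothetical):

```go
package main

import (
	"fmt"
	"regexp"
)

// leftmost returns the text of the leftmost match, or "" if there is none
// (an empty match also yields ""). Hypothetical helper for this sketch.
func leftmost(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	fmt.Printf("%q\n", leftmost(`a*`, "aaa"))       // "aaa": prefer more
	fmt.Printf("%q\n", leftmost(`a*?`, "aaa"))      // "": prefer fewer
	fmt.Printf("%q\n", leftmost(`a+?`, "aaa"))      // "a"
	fmt.Printf("%q\n", leftmost(`a?`, "a"))         // "a": prefer one
	fmt.Printf("%q\n", leftmost(`a??`, "a"))        // "": prefer zero
	fmt.Printf("%q\n", leftmost(`a{2,3}`, "aaaa"))  // "aaa"
	fmt.Printf("%q\n", leftmost(`a{2,3}?`, "aaaa")) // "aa"
	fmt.Printf("%q\n", leftmost(`a{2,}?`, "aaaa"))  // "aa"
}
```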
  39. Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}»
  40. reject forms that create a minimum or maximum repetition count above 1000.
  41. Unlimited repetitions are not subject to this restriction.
  42.  
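The repetition-count restriction shows up as a compile-time error. The sketch below assumes Go's `regexp` package, which applies the same limit of 1000; `accepted` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// accepted reports whether the parser accepts pattern.
// Hypothetical helper for this sketch.
func accepted(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(accepted(`a{1000}`))   // true: exactly at the limit
	fmt.Println(accepted(`a{1001}`))   // false: count above 1000
	fmt.Println(accepted(`a{2,1001}`)) // false: maximum above 1000
	fmt.Println(accepted(`a{1001,}`))  // false: minimum above 1000
	fmt.Println(accepted(`a*`))        // true: unlimited repetition is not restricted
}
```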
  43. Possessive repetitions:
  44. x*+ zero or more «x», possessive NOT SUPPORTED
  45. x++ one or more «x», possessive NOT SUPPORTED
  46. x?+ zero or one «x», possessive NOT SUPPORTED
  47. x{n,m}+ «n» or ... or «m» «x», possessive NOT SUPPORTED
  48. x{n,}+ «n» or more «x», possessive NOT SUPPORTED
  49. x{n}+ exactly «n» «x», possessive NOT SUPPORTED
  50.  
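Since possessive repetitions are not supported, an engine following this syntax is expected to reject them when the pattern is compiled. A quick check, assuming Go's `regexp` package (the `accepted` helper is hypothetical):

```go
package main

import (
	"fmt"
	"regexp"
)

// accepted reports whether the parser accepts pattern.
// Hypothetical helper for this sketch.
func accepted(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	for _, p := range []string{`a*+`, `a++`, `a?+`, `a{1,2}+`} {
		fmt.Println(p, accepted(p)) // each possessive form is rejected
	}
	fmt.Println(`a*`, accepted(`a*`)) // the plain form compiles
}
```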
  51. Grouping:
  52. (re) numbered capturing group (submatch)
  53. (?P<name>re) named & numbered capturing group (submatch)
  54. (?<name>re) named & numbered capturing group (submatch) NOT SUPPORTED
  55. (?'name're) named & numbered capturing group (submatch) NOT SUPPORTED
  56. (?:re) non-capturing group
  57. (?flags) set flags within current group; non-capturing
  58. (?flags:re) set flags during re; non-capturing
  59. (?#text) comment NOT SUPPORTED
  60. (?|x|y|z) branch numbering reset NOT SUPPORTED
  61. (?>re) possessive match of «re» NOT SUPPORTED
  62. re@> possessive match of «re» NOT SUPPORTED vim
  63. %(re) non-capturing group NOT SUPPORTED vim
  64.  
  65. Flags:
  66. i case-insensitive (default false)
  67. m multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
  68. s let «.» match «\n» (default false)
  69. U ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
  70. Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).
  71.  
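The flags and the set/clear syntax can be tried out as follows. This is a sketch under the assumption that Go's `regexp` package is the engine; `findAll` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// findAll returns every non-overlapping match of pattern in input.
// Hypothetical helper for this sketch.
func findAll(pattern, input string) []string {
	return regexp.MustCompile(pattern).FindAllString(input, -1)
}

func main() {
	fmt.Println(findAll(`(?i)abc`, "ABC"))            // [ABC]
	fmt.Println(findAll(`^\w+$`, "one\ntwo"))         // []: ^ and $ only match at text edges
	fmt.Println(findAll(`(?m)^\w+$`, "one\ntwo"))     // [one two]
	fmt.Println(len(findAll(`a.b`, "a\nb")))          // 0: . does not match \n
	fmt.Println(len(findAll(`(?s)a.b`, "a\nb")))      // 1
	fmt.Println(findAll(`(?U)a+`, "aaa"))             // [a a a]: a+ now prefers fewer
	fmt.Println(findAll(`(?U)a+?`, "aaa"))            // [aaa]: a+? now prefers more
	fmt.Println(findAll(`(?i)a(?-i)b`, "Ab AB"))      // [Ab]: i is cleared before b
	fmt.Println(findAll(`(?i-s:a.)`, "AB"))           // [AB]: set i, clear s, for this group only
}
```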
  72. Empty strings:
  73. ^ at beginning of text or line («m»=true)
  74. $ at end of text (like «\z» not «\Z») or line («m»=true)
  75. \A at beginning of text
  76. \b at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
  77. \B not at ASCII word boundary
  78. \G at beginning of subtext being searched NOT SUPPORTED pcre
  79. \G at end of last match NOT SUPPORTED perl
  80. \Z at end of text, or before newline at end of text NOT SUPPORTED
  81. \z at end of text
  82. (?=re) before text matching «re» NOT SUPPORTED
  83. (?!re) before text not matching «re» NOT SUPPORTED
  84. (?<=re) after text matching «re» NOT SUPPORTED
  85. (?<!re) after text not matching «re» NOT SUPPORTED
  86. re& before text matching «re» NOT SUPPORTED vim
  87. re@= before text matching «re» NOT SUPPORTED vim
  88. re@! before text not matching «re» NOT SUPPORTED vim
  89. re@<= after text matching «re» NOT SUPPORTED vim
  90. re@<! after text not matching «re» NOT SUPPORTED vim
  91. \zs sets start of match (= \K) NOT SUPPORTED vim
  92. \ze sets end of match NOT SUPPORTED vim
  93. \%^ beginning of file NOT SUPPORTED vim
  94. \%$ end of file NOT SUPPORTED vim
  95. \%V on screen NOT SUPPORTED vim
  96. \%# cursor position NOT SUPPORTED vim
  97. \%'m mark «m» position NOT SUPPORTED vim
  98. \%23l in line 23 NOT SUPPORTED vim
  99. \%23c in column 23 NOT SUPPORTED vim
  100. \%23v in virtual column 23 NOT SUPPORTED vim
  101.  
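A few of the empty-string assertions above, including the «$» behaves like «\z» rule, are sketched below. Go's `regexp` package is assumed as the RE2-syntax engine; `has` and `accepted` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// has reports whether pattern matches anywhere in input.
func has(pattern, input string) bool {
	return regexp.MustCompile(pattern).MatchString(input)
}

// accepted reports whether the parser accepts pattern.
func accepted(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(has(`^a$`, "a\n"))         // false: $ means end of text, not "before final newline"
	fmt.Println(has(`(?m)^a$`, "a\n"))     // true: with m, $ also matches at end of line
	fmt.Println(has(`\Ab`, "a\nb"))        // false: \A is beginning of text only
	fmt.Println(has(`(?m)^b`, "a\nb"))     // true
	fmt.Println(has(`\bcat\b`, "a cat."))  // true
	fmt.Println(has(`\bcat\b`, "concat"))  // false: no boundary between "n" and "c"
	fmt.Println(has(`\Bcat`, "concat"))    // true

	// Lookaround assertions and \Z, \G are listed as NOT SUPPORTED.
	for _, p := range []string{`(?=a)`, `(?!a)`, `(?<=a)b`, `(?<!a)b`, `\Z`, `\G`} {
		fmt.Println(p, accepted(p)) // false for each
	}
}
```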
  102. Escape sequences:
  103. \a bell (== \007)
  104. \f form feed (== \014)
  105. \t horizontal tab (== \011)
  106. \n newline (== \012)
  107. \r carriage return (== \015)
  108. \v vertical tab character (== \013)
  109. \* literal «*», for any punctuation character «*»
  110. \123 octal character code (up to three digits)
  111. \x7F hex character code (exactly two digits)
  112. \x{10FFFF} hex character code
  113. \C match a single byte even in UTF-8 mode
  114. \Q...\E literal text «...» even if «...» has punctuation
  115.  
  116. \1 backreference NOT SUPPORTED
  117. \b backspace NOT SUPPORTED (use «\010»)
  118. \cK control char ^K NOT SUPPORTED (use an octal escape, e.g. «\013» for ^K)
  119. \e escape NOT SUPPORTED (use «\033»)
  120. \g1 backreference NOT SUPPORTED
  121. \g{1} backreference NOT SUPPORTED
  122. \g{+1} backreference NOT SUPPORTED
  123. \g{-1} backreference NOT SUPPORTED
  124. \g{name} named backreference NOT SUPPORTED
  125. \g<name> subroutine call NOT SUPPORTED
  126. \g'name' subroutine call NOT SUPPORTED
  127. \k<name> named backreference NOT SUPPORTED
  128. \k'name' named backreference NOT SUPPORTED
  129. \lX lowercase «X» NOT SUPPORTED
  130. \ux uppercase «x» NOT SUPPORTED
  131. \L...\E lowercase text «...» NOT SUPPORTED
  132. \K reset beginning of «$0» NOT SUPPORTED
  133. \N{name} named Unicode character NOT SUPPORTED
  134. \R line break NOT SUPPORTED
  135. \U...\E upper case text «...» NOT SUPPORTED
  136. \X extended Unicode sequence NOT SUPPORTED
  137.  
  138. \%d123 decimal character 123 NOT SUPPORTED vim
  139. \%xFF hex character FF NOT SUPPORTED vim
  140. \%o123 octal character 123 NOT SUPPORTED vim
  141. \%u1234 Unicode character 0x1234 NOT SUPPORTED vim
  142. \%U12345678 Unicode character 0x12345678 NOT SUPPORTED vim
  143.  
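The supported escapes, and the octal workarounds suggested for the unsupported ones, can be exercised as in the sketch below. It assumes Go's `regexp` package as the RE2-syntax engine, which to my knowledge rejects «\C» even though the list above marks it as supported. `fullMatch` and `accepted` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch reports whether pattern matches the whole input.
func fullMatch(pattern, input string) bool {
	return regexp.MustCompile(`\A(?:` + pattern + `)\z`).MatchString(input)
}

// accepted reports whether the parser accepts pattern.
func accepted(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(fullMatch(`\x41\102\x{43}`, "ABC"))         // true: hex, octal, braced hex
	fmt.Println(fullMatch(`\x{1F600}`, "\U0001F600"))       // true: code point above U+FFFF
	fmt.Println(fullMatch(`\t\n\r\f\v\a`, "\t\n\r\f\v\a"))  // true
	fmt.Println(fullMatch(`\*\.\?`, "*.?"))                 // true: escaped punctuation is literal
	fmt.Println(fullMatch(`\Q1+1=2\E`, "1+1=2"))            // true: quoted text is literal
	fmt.Println(fullMatch(`1+1=2`, "1+1=2"))                // false: unquoted + is a repetition
	fmt.Println(fullMatch(`\010`, "\b"))                    // true: octal stand-in for backspace
	fmt.Println(fullMatch(`\033`, "\x1b"))                  // true: octal stand-in for escape

	for _, p := range []string{`(a)\1`, `\cK`, `\e`, `\R`, `(?P<n>a)\k<n>`} {
		fmt.Println(p, accepted(p)) // false for each
	}
}
```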
  144. Character class elements:
  145. x single character
  146. A-Z character range (inclusive)
  147. \d Perl character class
  148. [:foo:] ASCII character class «foo»
  149. \p{Foo} Unicode character class «Foo»
  150. \pF Unicode character class «F» (one-letter name)
  151.  
  152. Named character classes as character class elements:
  153. [\d] digits (== \d)
  154. [^\d] not digits (== \D)
  155. [\D] not digits (== \D)
  156. [^\D] not not digits (== \d)
  157. [[:name:]] named ASCII class inside character class (== [:name:])
  158. [^[:name:]] named ASCII class inside negated character class (== [:^name:])
  159. [\p{Name}] named Unicode property inside character class (== \p{Name})
  160. [^\p{Name}] named Unicode property inside negated character class (== \P{Name})
  161.  
  162. Perl character classes (all ASCII-only):
  163. \d digits (== [0-9])
  164. \D not digits (== [^0-9])
  165. \s whitespace (== [\t\n\f\r ])
  166. \S not whitespace (== [^\t\n\f\r ])
  167. \w word characters (== [0-9A-Za-z_])
  168. \W not word characters (== [^0-9A-Za-z_])
  169.  
  170. \h horizontal space NOT SUPPORTED
  171. \H not horizontal space NOT SUPPORTED
  172. \v vertical space NOT SUPPORTED
  173. \V not vertical space NOT SUPPORTED
  174.  
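"ASCII-only" has visible consequences for non-ASCII input, and «\s» is slightly narrower than «[[:space:]]». A sketch, assuming Go's `regexp` package as the RE2-syntax engine (helper names are hypothetical):

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch reports whether pattern matches the whole input.
func fullMatch(pattern, input string) bool {
	return regexp.MustCompile(`\A(?:` + pattern + `)\z`).MatchString(input)
}

// accepted reports whether the parser accepts pattern.
func accepted(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(fullMatch(`\d`, "7"))               // true
	fmt.Println(fullMatch(`\d`, "\u0663"))          // false: ARABIC-INDIC DIGIT THREE is not ASCII
	fmt.Println(fullMatch(`\p{Nd}`, "\u0663"))      // true: use the Unicode class instead
	fmt.Println(fullMatch(`\w`, "\u00e9"))          // false: U+00E9 is not in [0-9A-Za-z_]
	fmt.Println(fullMatch(`\s`, "\v"))              // false: \s omits vertical tab
	fmt.Println(fullMatch(`[[:space:]]`, "\v"))     // true: [[:space:]] includes it
	fmt.Println(accepted(`\h`), accepted(`\V`))     // false false
}
```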
  175. ASCII character classes:
  176. [[:alnum:]] alphanumeric (== [0-9A-Za-z])
  177. [[:alpha:]] alphabetic (== [A-Za-z])
  178. [[:ascii:]] ASCII (== [\x00-\x7F])
  179. [[:blank:]] blank (== [\t ])
  180. [[:cntrl:]] control (== [\x00-\x1F\x7F])
  181. [[:digit:]] digits (== [0-9])
  182. [[:graph:]] graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
  183. [[:lower:]] lower case (== [a-z])
  184. [[:print:]] printable (== [ -~] == [ [:graph:]])
  185. [[:punct:]] punctuation (== [!-/:-@[-`{-~])
  186. [[:space:]] whitespace (== [\t\n\v\f\r ])
  187. [[:upper:]] upper case (== [A-Z])
  188. [[:word:]] word characters (== [0-9A-Za-z_])
  189. [[:xdigit:]] hex digit (== [0-9A-Fa-f])
  190.  
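The equivalences in this table can be checked mechanically by comparing two patterns over all 128 ASCII code points. The sketch below assumes Go's `regexp` package as the RE2-syntax engine; `sameASCII` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// sameASCII reports whether two single-character patterns accept exactly
// the same set of ASCII characters (code points 0 through 127).
func sameASCII(p1, p2 string) bool {
	a := regexp.MustCompile(`\A(?:` + p1 + `)\z`)
	b := regexp.MustCompile(`\A(?:` + p2 + `)\z`)
	for c := 0; c < 128; c++ {
		s := string(rune(c))
		if a.MatchString(s) != b.MatchString(s) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(sameASCII(`[[:alnum:]]`, `[0-9A-Za-z]`))      // true
	fmt.Println(sameASCII(`[[:graph:]]`, `[!-~]`))            // true
	fmt.Println(sameASCII(`[[:print:]]`, `[ -~]`))            // true
	fmt.Println(sameASCII(`[[:punct:]]`, "[!-/:-@[-`{-~]"))   // true
	fmt.Println(sameASCII(`[[:space:]]`, `[\t\n\v\f\r ]`))    // true
	fmt.Println(sameASCII(`[[:cntrl:]]`, `[\x00-\x1F\x7F]`))  // true
	fmt.Println(sameASCII(`[[:word:]]`, `\w`))                // true
	fmt.Println(sameASCII(`[[:alpha:]]`, `[[:alnum:]]`))      // false: digits differ
}
```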
  191. Unicode character class names--general category:
  192. C other
  193. Cc control
  194. Cf format
  195. Cn unassigned code points NOT SUPPORTED
  196. Co private use
  197. Cs surrogate
  198. L letter
  199. LC cased letter NOT SUPPORTED
  200. L& cased letter NOT SUPPORTED
  201. Ll lowercase letter
  202. Lm modifier letter
  203. Lo other letter
  204. Lt titlecase letter
  205. Lu uppercase letter
  206. M mark
  207. Mc spacing mark
  208. Me enclosing mark
  209. Mn non-spacing mark
  210. N number
  211. Nd decimal number
  212. Nl letter number
  213. No other number
  214. P punctuation
  215. Pc connector punctuation
  216. Pd dash punctuation
  217. Pe close punctuation
  218. Pf final punctuation
  219. Pi initial punctuation
  220. Po other punctuation
  221. Ps open punctuation
  222. S symbol
  223. Sc currency symbol
  224. Sk modifier symbol
  225. Sm math symbol
  226. So other symbol
  227. Z separator
  228. Zl line separator
  229. Zp paragraph separator
  230. Zs space separator
  231.  
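General-category names are used with «\p{...}» (or «\pX» for one-letter names). A brief sketch with a handful of code points, assuming Go's `regexp` package as the RE2-syntax engine; `fullMatch` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch reports whether pattern matches the whole input.
func fullMatch(pattern, input string) bool {
	return regexp.MustCompile(`\A(?:` + pattern + `)\z`).MatchString(input)
}

func main() {
	fmt.Println(fullMatch(`\p{Lu}`, "\u00c4")) // true: U+00C4 is an uppercase letter
	fmt.Println(fullMatch(`\p{Ll}`, "\u00c4")) // false
	fmt.Println(fullMatch(`\pL`, "\u00c4"))    // true: L covers every letter subcategory
	fmt.Println(fullMatch(`\p{Nl}`, "\u2167")) // true: ROMAN NUMERAL EIGHT is a letter number
	fmt.Println(fullMatch(`\p{Nd}`, "\u2167")) // false
	fmt.Println(fullMatch(`\pN`, "\u2167"))    // true
	fmt.Println(fullMatch(`\p{Sc}`, "\u20ac")) // true: EURO SIGN is a currency symbol
	fmt.Println(fullMatch(`\p{Zs}`, "\u00a0")) // true: NO-BREAK SPACE is a space separator
	fmt.Println(fullMatch(`\PL`, "1"))         // true: negated class
}
```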
  232. Unicode character class names--scripts:
  233. Adlam
  234. Ahom
  235. Anatolian_Hieroglyphs
  236. Arabic
  237. Armenian
  238. Avestan
  239. Balinese
  240. Bamum
  241. Bassa_Vah
  242. Batak
  243. Bengali
  244. Bhaiksuki
  245. Bopomofo
  246. Brahmi
  247. Braille
  248. Buginese
  249. Buhid
  250. Canadian_Aboriginal
  251. Carian
  252. Caucasian_Albanian
  253. Chakma
  254. Cham
  255. Cherokee
  256. Common
  257. Coptic
  258. Cuneiform
  259. Cypriot
  260. Cyrillic
  261. Deseret
  262. Devanagari
  263. Dogra
  264. Duployan
  265. Egyptian_Hieroglyphs
  266. Elbasan
  267. Elymaic
  268. Ethiopic
  269. Georgian
  270. Glagolitic
  271. Gothic
  272. Grantha
  273. Greek
  274. Gujarati
  275. Gunjala_Gondi
  276. Gurmukhi
  277. Han
  278. Hangul
  279. Hanifi_Rohingya
  280. Hanunoo
  281. Hatran
  282. Hebrew
  283. Hiragana
  284. Imperial_Aramaic
  285. Inherited
  286. Inscriptional_Pahlavi
  287. Inscriptional_Parthian
  288. Javanese
  289. Kaithi
  290. Kannada
  291. Katakana
  292. Kayah_Li
  293. Kharoshthi
  294. Khmer
  295. Khojki
  296. Khudawadi
  297. Lao
  298. Latin
  299. Lepcha
  300. Limbu
  301. Linear_A
  302. Linear_B
  303. Lisu
  304. Lycian
  305. Lydian
  306. Mahajani
  307. Makasar
  308. Malayalam
  309. Mandaic
  310. Manichaean
  311. Marchen
  312. Masaram_Gondi
  313. Medefaidrin
  314. Meetei_Mayek
  315. Mende_Kikakui
  316. Meroitic_Cursive
  317. Meroitic_Hieroglyphs
  318. Miao
  319. Modi
  320. Mongolian
  321. Mro
  322. Multani
  323. Myanmar
  324. Nabataean
  325. Nandinagari
  326. New_Tai_Lue
  327. Newa
  328. Nko
  329. Nushu
  330. Nyiakeng_Puachue_Hmong
  331. Ogham
  332. Ol_Chiki
  333. Old_Hungarian
  334. Old_Italic
  335. Old_North_Arabian
  336. Old_Permic
  337. Old_Persian
  338. Old_Sogdian
  339. Old_South_Arabian
  340. Old_Turkic
  341. Oriya
  342. Osage
  343. Osmanya
  344. Pahawh_Hmong
  345. Palmyrene
  346. Pau_Cin_Hau
  347. Phags_Pa
  348. Phoenician
  349. Psalter_Pahlavi
  350. Rejang
  351. Runic
  352. Samaritan
  353. Saurashtra
  354. Sharada
  355. Shavian
  356. Siddham
  357. SignWriting
  358. Sinhala
  359. Sogdian
  360. Sora_Sompeng
  361. Soyombo
  362. Sundanese
  363. Syloti_Nagri
  364. Syriac
  365. Tagalog
  366. Tagbanwa
  367. Tai_Le
  368. Tai_Tham
  369. Tai_Viet
  370. Takri
  371. Tamil
  372. Tangut
  373. Telugu
  374. Thaana
  375. Thai
  376. Tibetan
  377. Tifinagh
  378. Tirhuta
  379. Ugaritic
  380. Vai
  381. Wancho
  382. Warang_Citi
  383. Yi
  384. Zanabazar_Square
  385.  
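Script names from the list above go inside «\p{...}». The following sketch splits a mixed string by script, assuming Go's `regexp` package as the RE2-syntax engine. The helpers and the sample text are invented for illustration, and `Klingon` is a deliberately unknown name.

```go
package main

import (
	"fmt"
	"regexp"
)

// findAll returns every non-overlapping match of pattern in input.
func findAll(pattern, input string) []string {
	return regexp.MustCompile(pattern).FindAllString(input, -1)
}

// accepted reports whether the parser accepts pattern.
func accepted(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Two Han characters, one Hiragana character, then Latin letters.
	text := "\u6f22\u5b57\u3068kana"
	fmt.Println(findAll(`\p{Han}+`, text))      // the two Han characters
	fmt.Println(findAll(`\p{Hiragana}+`, text)) // the single Hiragana character
	fmt.Println(findAll(`\p{Latin}+`, text))    // [kana]
	fmt.Println(accepted(`\p{Klingon}`))        // false: not a known script name
}
```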
  386. Vim character classes:
  387. \i identifier character NOT SUPPORTED vim
  388. \I «\i» except digits NOT SUPPORTED vim
  389. \k keyword character NOT SUPPORTED vim
  390. \K «\k» except digits NOT SUPPORTED vim
  391. \f file name character NOT SUPPORTED vim
  392. \F «\f» except digits NOT SUPPORTED vim
  393. \p printable character NOT SUPPORTED vim
  394. \P «\p» except digits NOT SUPPORTED vim
  395. \s whitespace character (== [ \t]) NOT SUPPORTED vim
  396. \S non-white space character (== [^ \t]) NOT SUPPORTED vim
  397. \d digits (== [0-9]) vim
  398. \D not «\d» vim
  399. \x hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
  400. \X not «\x» NOT SUPPORTED vim
  401. \o octal digits (== [0-7]) NOT SUPPORTED vim
  402. \O not «\o» NOT SUPPORTED vim
  403. \w word character vim
  404. \W not «\w» vim
  405. \h head of word character NOT SUPPORTED vim
  406. \H not «\h» NOT SUPPORTED vim
  407. \a alphabetic NOT SUPPORTED vim
  408. \A not «\a» NOT SUPPORTED vim
  409. \l lowercase NOT SUPPORTED vim
  410. \L not lowercase NOT SUPPORTED vim
  411. \u uppercase NOT SUPPORTED vim
  412. \U not uppercase NOT SUPPORTED vim
  413. \_x «\x» plus newline, for any «x» NOT SUPPORTED vim
  414.  
  415. Vim flags:
  416. \c ignore case NOT SUPPORTED vim
  417. \C match case NOT SUPPORTED vim
  418. \m magic NOT SUPPORTED vim
  419. \M nomagic NOT SUPPORTED vim
  420. \v verymagic NOT SUPPORTED vim
  421. \V verynomagic NOT SUPPORTED vim
  422. \Z ignore differences in Unicode combining characters NOT SUPPORTED vim
  423.  
  424. Magic:
  425. (?{code}) arbitrary Perl code NOT SUPPORTED perl
  426. (??{code}) postponed arbitrary Perl code NOT SUPPORTED perl
  427. (?n) recursive call to regexp capturing group «n» NOT SUPPORTED
  428. (?+n) recursive call to relative group «+n» NOT SUPPORTED
  429. (?-n) recursive call to relative group «-n» NOT SUPPORTED
  430. (?C) PCRE callout NOT SUPPORTED pcre
  431. (?R) recursive call to entire regexp (== (?0)) NOT SUPPORTED
  432. (?&name) recursive call to named group NOT SUPPORTED
  433. (?P=name) named backreference NOT SUPPORTED
  434. (?P>name) recursive call to named group NOT SUPPORTED
  435. (?(cond)true|false) conditional branch NOT SUPPORTED
  436. (?(cond)true) conditional branch NOT SUPPORTED
  437. (*ACCEPT) make regexps more like Prolog NOT SUPPORTED
  438. (*COMMIT) NOT SUPPORTED
  439. (*F) NOT SUPPORTED
  440. (*FAIL) NOT SUPPORTED
  441. (*MARK) NOT SUPPORTED
  442. (*PRUNE) NOT SUPPORTED
  443. (*SKIP) NOT SUPPORTED
  444. (*THEN) NOT SUPPORTED
  445. (*ANY) set newline convention NOT SUPPORTED
  446. (*ANYCRLF) NOT SUPPORTED
  447. (*CR) NOT SUPPORTED
  448. (*CRLF) NOT SUPPORTED
  449. (*LF) NOT SUPPORTED
  450. (*BSR_ANYCRLF) set \R convention NOT SUPPORTED pcre
  451. (*BSR_UNICODE) NOT SUPPORTED pcre