regex101.com is a useful diagnostic test site for regular expressions.
https://www.php.net/manual/en/function.preg-match.php is the php online manual page for preg_match. The manual is great if you are already an expert in php and therefore have no need of a manual. And totally frustrating if you are not.
Professional protectionism at its best.
To keep a search group and replace it with itself, use () around the search group and \1 in the replacement for the first group, \2 for the 2nd.
To search for a string which does not contain the word <span use (?s)(?:(?!<span).)*? for a multiline search or use (?:(?!<span).)*? for a single line search. The (?s) makes the dot match line breaks as well.
So to stick a carriage return and line feed after every <span>.....</span> that does not contain another (nested) <span>: Replace
<span((?s)(?:(?!<span).)*?)</span> with <span\1</span>\r\n
Single line version is <span((?:(?!<span).)*?)</span>. There is no need to escape the / in </span> by inserting a \ to make <\/span> in notepad++. One should however escape it in php when / is the pattern delimiter, so that php treats it as a literal forward slash rather than the end of the pattern.
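As a hedged sketch of the same search and replace outside notepad++, here it is in javascript. Javascript does not accept the inline (?s), so the s flag on the end of the pattern does that job. The sample text and the variable names are made up for illustration.

```javascript
// Add \r\n after every <span>...</span> that contains no nested <span>.
// The s flag lets the dot match line breaks (the job (?s) does in notepad++).
const innermostSpan = /<span((?:(?!<span).)*?)<\/span>/gs;

// Made-up sample: an outer span wrapping an inner span that crosses a line break.
const sampleHtml = '<span class="a">outer <span>inner\ntext</span> tail</span>';

// $1 is javascript's way of writing \1 in the replacement.
const withBreaks = sampleHtml.replace(innermostSpan, '<span$1</span>\r\n');

console.log(JSON.stringify(withBreaks));
```

Only the inner span gets the line break, because the outer one contains another '<span' and so the pattern refuses to match it.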
So to remove all <font size="2">...</font> tags from $webpage we do this in notepad++...
Search: <font size="2">((?s)(?:(?!<font).)*?)</font>
Replace: \1
This finds any pattern starting with <font size="2"> and ending with </font> and not having any other '<font' strings (i.e. any other font tags) within the pattern.
Or we do this in php...
$search = '/<font size="2">((?s)(?:(?!<font).)*?)<\/font>/';
$replace = '$1';
$webpage = preg_replace($search, $replace, $webpage);
^((?!bingo).)*$ finds every line which does not have the word 'bingo' in it
^(.*)(\r?\n\1)+$ finds duplicate whole lines.
(\b\S+\b)\s+\b\1\b finds immediately repeated words
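The three patterns above can be tried out in javascript as a rough check. This is only a sketch: the sample strings are invented, the m flag is needed so that ^ and $ work line by line, and the samples use \n line ends only.

```javascript
// 1. Lines that do not contain 'bingo'.
const lines = 'one\nbingo two\nthree';
const noBingo = lines.match(/^((?!bingo).)*$/gm);

// 2. Runs of duplicate whole lines, collapsed to a single copy.
const withDuplicates = 'alpha\nalpha\nbeta\ngamma\ngamma\ngamma';
const deduped = withDuplicates.replace(/^(.*)(\r?\n\1)+$/gm, '$1');

// 3. Immediately repeated words, collapsed to one.
const stutter = 'the the cat sat sat down';
const fixed = stutter.replace(/(\b\S+\b)\s+\b\1\b/g, '$1');

console.log(noBingo, JSON.stringify(deduped), fixed);
```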
Search: <font face="Lwhebg" size="6">((?s)(?:(?!<font).)*?)</font>
Replace: <q class="h">\1</q>
Replaces the long hand font face statement with a shorthand equivalent where the class "h" of q (normally used for quotes but hijacked for general use due to its one character tag) is that font in that size with the normal features of the <q> tag removed.
Search: <q class="h">([^¿<]+)À</q>
Replace: <q class="h">\1</q>\)
Finds every h class q ending in an À but excluding a ¿ and a <, and replaces the À (which is a lwheb right bracket) with a right bracket (in the paragraph font - the non lwheb font).
Search: [^ =\r\n]+ [^ =\r\n]+ = Finds every phrase to the left of = which has a space in it.
Search: ^((?:(?!</q> ::).)*?)\r\n Finds every line that does not have '</q> ::' in it.
Search: ^(LW[\d]{1,4} :: <q[^\r\n]+<b>Total</b> = )([\d]{1,5})<br>\r\n
Replace \2
\1\2<br>\r\n
This sticks the 'Total' number of incidences of the main entry word at the start of the lexicon entry for sorting the lexicon on frequency
Search: <q class="h">([^>]*)l¿([^<]*)</q>
Replace: <q class="h">\1lo\2</q>
This replaces l followed by ¿ which is a right bracket, with lo which is Lamed Holem in lwheb.
Search: <q class="h">([^>]*)([^a])o([^<]*)</q>
Replace: <q class="h">\1\2O\3</q>
This replaces all incidences in Hebrew of 'o' not preceded by 'a', with O.
Search: \*\*([^\*]+)\*\*
Replace: <b>\1</b>
This replaces Grok's **something important** with the html bold version of 'something important'.
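A small javascript sketch of the same bold conversion, for use outside notepad++. The sample sentence is invented. Inside the square brackets the * does not need escaping.

```javascript
// Turn **something** into <b>something</b>.
const reply = 'Use **lookaheads** here, not **lookbehinds**.';
const html = reply.replace(/\*\*([^*]+)\*\*/g, '<b>$1</b>');

console.log(html);
```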
To enter column mode hold down alt and shift then use the arrow keys to select, and control C and control V to copy and paste.
$count = preg_match_all('/regex search pattern/' , variable name of the haystack to be searched, variable name of the array of matches);
The function returns an integer value equal to the number of matches it has found (which can be 0) or 'false' if your regex pattern or syntax is no good. The line above puts that count into the variable $count. All php variables start with a dollar and then a letter or an underscore.
You have to start the regex pattern with '/ and end it with /'. The quotes make it a php string and the slashes are the regex delimiters. That is the usual format for php, although other delimiter characters such as # or ~ also work.
Example: $count = preg_match_all($search, $linkcontents, $matches, PREG_PATTERN_ORDER);
This counts all of the matches in the text $linkcontents of the regex pattern with variable name $search and puts said matches into the 1D array called $matches[0]. That is the nomenclature of a 2D array, because $matches as a whole is a 2D array and $matches[0] is one row of it.
The 1D array called $matches[1] has all the first bracketed capture strings of the matches.
The 1D array called $matches[2] has all the second bracketed capture strings of the matches. So that...
$matches[0][0] is the first match, $matches[0][1] is the 2nd match, $matches[0][2] is the 3rd match etc.
$matches[1][0] is the first bracketed capture string in the first match, $matches[1][1] is the 1st bracketed capture string in the 2nd match etc.
$matches[2][0] is the 2nd bracketed capture string in the first match, $matches[2][1] is the 2nd bracketed capture string in the 2nd match etc.
So in general $matches[m][n] is the mth bracketed capture string in the n+1th match. And here we see in its full glory the satanic brain twisting ruse of indexing the first element of an array as zero rather than 1. So an array with 5 elements is labelled from 0 to 4. Now try that in 2D!
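Javascript has no preg_match_all, but the same [capture group][match number] layout can be imitated. The helper below is a hypothetical sketch (matchAllPatternOrder is an invented name, not a built in function) and the scripture reference text is made up. The regex must carry the g flag or matchAll() throws an error.

```javascript
// Build a 2D array laid out like php's PREG_PATTERN_ORDER:
// matches[0] holds the whole matches, matches[1] the first captures, and so on.
function matchAllPatternOrder(regex, haystack) {
  const found = [...haystack.matchAll(regex)];
  // Each match holds the whole match plus one entry per capture group.
  const groupCount = found.length > 0 ? found[0].length : 1;
  const matches = [];
  for (let m = 0; m < groupCount; m++) {
    matches.push(found.map((oneMatch) => oneMatch[m]));
  }
  return matches;
}

const refs = matchAllPatternOrder(/([A-Z][a-z]+) (\d+):(\d+)/g, 'Gen 1:1 and Psa 119:119');
const refCount = refs[0].length; // plays the part of $count

console.log(refCount, refs[0], refs[1][1]);
```

Here refs[1][1] is the 1st bracketed capture string in the 2nd match, which is 'Psa'.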
$count = preg_match('/regex search pattern/' , variable name of the haystack to be searched, variable name of the array of matches);
$count will be 0 if there is no match and 1 if a match is found and 'false' if your regex pattern or syntax is no good. If you use $match for your array of matches then: $match[0] is the first (and only) matched string. $match[1] is the first bracketed capture string in your regex search pattern, $match[2] is the 2nd bracketed capture string in your pattern etc.
$text = preg_replace('/regex search pattern/', replacement string, $text);
This replaces all incidences of the regex search pattern with the replacement string in $text. If you want to record the count of the number of replacements use:
$text = preg_replace('/regex search pattern/', replacement string, $text, -1, $count);
The -1 is php's shorthand for infinity (no limit). It tells the function to keep replacing no matter how many incidences of the pattern it finds. Or if you use a positive number, then it will only replace that number of patterns. $count is the number of replacements it has made.
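Javascript's replace() has neither a limit nor a count, so here is a hedged sketch of how one might imitate both. replaceWithCount is an invented helper name, it expects a regex with the g flag, and unlike preg_replace it treats the replacement as plain text (no \1 or $1 expansion).

```javascript
// Replace up to 'limit' matches (-1 means no limit) and report how many were replaced.
function replaceWithCount(text, regex, replacement, limit = -1) {
  let count = 0;
  const result = text.replace(regex, (match) => {
    if (limit >= 0 && count >= limit) {
      return match; // limit reached: leave this match as it was
    }
    count++;
    return replacement;
  });
  return { text: result, count: count };
}

const everyDash = replaceWithCount('a-b-c-d', /-/g, '+');
const firstTwo = replaceWithCount('a-b-c-d', /-/g, '+', 2);

console.log(everyDash, firstTwo);
```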
$count = substr_count(haystack, needle);
This function delivers the number of times the needle appears in the haystack. It is very useful.
Example: if (substr_count($cellp[$m], ">3mp<") > 0) {$fix = $fix + 1; }
This counts the number of >3mp< strings in the array element $cellp[$m] and if that number is greater than 0 it adds one to $fix - which we use to cope with the disparity in subcell numbers caused by pronoun suffixes being included in the parsing and excluded from the Hebrew roots in the WLC morphology.
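Javascript has no substr_count either. A common workaround, sketched here under the assumption that the needle is never an empty string, is to split the haystack at the needle and count the pieces. countSubstr is an invented name and the table cell text is made up.

```javascript
// Number of non overlapping times the needle appears in the haystack.
function countSubstr(haystack, needle) {
  return haystack.split(needle).length - 1;
}

const cell = '<td>3mp</td><td>3fs</td>';
let fix = 0;
if (countSubstr(cell, '>3mp<') > 0) { fix = fix + 1; }

console.log(countSubstr(cell, '>3mp<'), fix);
```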
$rootfind = str_replace(needle, replacement, haystack);
Example: $rootfind = str_replace(">", "&gt;", $rootfind);
str_replace() is the non regex version of preg_replace(). The example above replaces every > with &gt; (the html version of a literal > rather than > being an html tag bracket) in the variable $rootfind.
$array = explode(separator, string);
explode splits a string at the specified separator into an array of intra separator chunks. For example...
$versei = explode("</td><td>", $WLCi[$v]);
This splits the array element $WLCi[$v] at every incidence of </td><td>, into the array $versei. The created array excludes the separators.
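For comparison, the javascript counterpart of explode is the split() method, with the string first and the separator in the brackets. A tiny sketch with an invented table row:

```javascript
// Split a row of table cells at every </td><td> separator.
const row = 'bereshit</td><td>bara</td><td>elohim';
const versei = row.split('</td><td>');

console.log(versei);
```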
Example: if (substr_count($_SESSION['log'], '<br>') > 70) {$_SESSION['log'] = preg_replace('/<br>(?s)(?:(?!<br>).)*?<br>\r\n\z/', "<br>\r\n", $_SESSION['log']); }
This code deletes the last line of the log if the number of lines in the log is greater than 70. $_SESSION['log'] is a php session variable, which stores a text string on the server (the browser holds only a cookie with the session id) and makes it available to all php pages in the site during a session (i.e. until you close your browser). The preg_replace looks for a string starting with the line break <br> and ending with another <br> then \r\n, which is carriage return line feed (end of line), and then \z, which is the very end of the text in regex. The pattern (?s)(?:(?!<br>).)*? says 'not containing the string <br>'. So we delete everything from the 2nd last <br> to the end of the text and replace it with <br>\r\n.
In both php and javascript the condition of every if statement must be enclosed in round brackets () and the actions it permits or denies go in curly brackets {} as above. The curly brackets can be left off for a single action, but it is safer to keep them.
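The same trimming idea can be sketched in javascript. The assumptions here are that trimLog is an invented helper, that the limit is passed in rather than fixed at 70, and that the s flag and a plain $ stand in for (?s) and \z, which javascript does not have.

```javascript
// Drop the last line of the log when it holds more than maxLines lines.
function trimLog(log, maxLines) {
  const lineCount = log.split('<br>').length - 1;
  if (lineCount > maxLines) {
    // From the 2nd last <br> to the end of the text, leave just <br>\r\n.
    return log.replace(/<br>(?:(?!<br>).)*?<br>\r\n$/s, '<br>\r\n');
  }
  return log;
}

const log = 'first<br>\r\nsecond<br>\r\nthird<br>\r\n';
const trimmed = trimLog(log, 2);   // 3 lines > 2, so 'third' goes
const untouched = trimLog(log, 5); // 3 lines is within the limit

console.log(JSON.stringify(trimmed), JSON.stringify(untouched));
```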
array_splice(array to be spliced, position to start at, number of elements to take out, replacement for the cut out elements);
array_splice($versef, $n, 0, 'namely'); // This takes out nothing and inserts 'namely' in front of the $nth element in the array $versef (counting from 0).
array_splice($versef, $n, 2, $versef[$n].'-'.$versef[$n+1]); // This replaces the ($n)th and ($n+1)th elements in the array $versef with the combination element '$versef[$n]-$versef[$n+1]'.
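Javascript arrays have a splice() method with the same start, number to take out, replacement order of arguments. A short sketch with a made-up verse array shows the difference between taking out 0 elements and taking out 2:

```javascript
const versef = ['in', 'the', 'beginning'];

// Take out nothing at position 1 and insert 'namely' there.
versef.splice(1, 0, 'namely');
const afterInsert = versef.slice(); // copy: in, namely, the, beginning

// Take out elements n and n+1 and put the joined element in their place.
const n = 2;
versef.splice(n, 2, versef[n] + '-' + versef[n + 1]);

console.log(afterInsert, versef);
```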
$rootfind = substr($rootfind, 1, -1); //removes the first and last characters from the string $rootfind.
Search (\d) ([A-Z12])
Replace \1", "\2 //This puts ", " between every scripture reference in the format Psa 119:119
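The same reference splitting replace, sketched in javascript with an invented list of references. The 1 and 2 in the second bracket are there to catch book names such as 1Sa.

```javascript
// Put ", " between scripture references that are separated by a single space.
const refList = 'Psa 119:119 Gen 1:1 1Sa 2:3';
const quotedRefs = refList.replace(/(\d) ([A-Z12])/g, '$1", "$2');

console.log(quotedRefs);
```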
<script>
function str_replace(search, replace, string) {return string.split(search).join(replace); }
</script>
is a really useful javascript equivalent of the php function str_replace. Javascript tries to do everything the other way around to php, knowing that a developer will have to write in both. So for a long time it had no native str_replace function (newer browsers have replaceAll()). It has instead a replace() function which replaces only the first occurrence of the search string, unless you use a regex with the g flag on the end for your search pattern, in which case it replaces all matching patterns. Except on Thursday during a full moon in which case it makes you a piña colada instead. With php you must prefix every variable name with a $ sign. With javascript you need not. With php you join strings together using a full stop, a period. With javascript you join them with a +. With php a for loop looks like...
for ($n = 0; $n < 100; $n++) {do stuff; }
With javascript it looks like
for (let n = 0; n < 100; n++) {do stuff; }
To define a variable in php you just get on with it and say $n = 62; or whatever. So of course you are not supposed to do that with javascript (strict mode refuses it outright). Instead you must say one of...
let n = 62;
const n = 62;
var n = 62;
Depending not upon the day of the week but on a classification system (let is block scoped and can be reassigned, const is block scoped and cannot be reassigned, var is the older function scoped form) which is about as useful as the data protection registrar or the national pharmaceutical regulator.
But the greatest one of them all is that if you forget to put a semi colon on the end of a php instruction, then the whole page will just not run, and the error message (if error display is even switched on) points at the line after the one you got wrong. Javascript will usually guess where the semi colon should have gone, and occasionally guesses wrong without telling you. That gets everyone a lot of times. That is what is called progress from the days of fortran compilers in the 70s, which actually told you what you had done wrong. For comparison, in php one says:
$word = str_replace("a", "b", $word); to replace all incidences of 'a' with 'b' in the php variable $word
let word1 = word.replace(/a/g, "b"); is how one makes a new variable in javascript which has all the incidences of 'a' replaced by 'b' in the variable word.
In php you usually update the variable you started with. In javascript you can do the same with word = word.replace(/a/g, "b"); provided word was declared with let or var rather than const, otherwise you have to put the result into a new variable. In php you use a different function to str_replace for regex search and replaces called preg_replace(). But in javascript you can do a regular search or a regex pattern search in the same function. In short if php is right handed then javascript is left handed. So all webcoders have to become ambidextrous.
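To see the javascript side of that comparison in one place, here is a short sketch using an invented test word. It assumes a reasonably modern browser or Node version, since replaceAll() is a fairly recent addition.

```javascript
let word = 'banana';

// A plain string search replaces the first occurrence only.
const firstOnly = word.replace('a', 'b');

// A regex with the g flag replaces every occurrence.
const everyRegex = word.replace(/a/g, 'b');

// replaceAll() does the same with a plain string search.
const everyPlain = word.replaceAll('a', 'b');

// The split and join trick behaves like php's str_replace.
const everySplit = word.split('a').join('b');

// Updating the variable you started with works because word was declared with let.
word = word.replace(/a/g, 'b');

console.log(firstOnly, everyRegex, everyPlain, everySplit, word);
```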