Consider the following HTML snippet:
[arbitrary content]<script type="text/javascript"> window.__PRELOADED_STATE__ = {"locale": "fr-FR","env": "production" };</script><script type="text/javascript"> window.dataLayer = [{"event": "data-layer-loaded", [...] }];</script>[more arbitrary content]
I'm trying to isolate specifically the hash defined as window.__PRELOADED_STATE__
, i.e. the following content:
{"locale": "fr-FR","env": "production"}
For some reason, the following pattern does not seem to get a match:
/<script type="text\/javascript">window.__PRELOADED_STATE__ = (.*?);<\/script>/
(in this case the result would be in group 1, but regex groups are not a requirement)
Does anybody have a better suggestion? For what it's worth, the above is intended for a vanilla ruby 3.3 context (not rails). Besides, the source content contains dozens of scripts, so matching any script would not be sufficient.
Any input would be most welcome!