XPATH syntax validator in Python


XPATH syntax validator in Python



I develop a crawler with many actions. Many xpaths are involving and for this reason I use a json file for storing. Then crawler start running I would like to make a basic syntax check (before xpath usage) on xpaths and raise error for invalid xpaths.



for example:


xpath1 = '//*[@id="react-root"]/section'
xpath2 = '//*[[@id="react-root"]/section'
xpath3 = '//*[@id="react-root"]section'



from these xpaths only xpath1 is valid



is there any module or regex which does this kind of validation?




1 Answer
1



You can compile the xpath strings with lxml.etree.XPath which will raise an exception if the syntax is incorrect:


lxml.etree.XPath


>>> import lxml.etree
>>> lxml.etree.XPath('//*[@id="react-root"]/section')
//*[@id="react-root"]/section
>>> lxml.etree.XPath('//*[[@id="react-root"]/section')
Traceback (most recent call last):
...
lxml.etree.XPathSyntaxError: Invalid expression
>>> lxml.etree.XPath(r'//*[@id="react-root"]section')
Traceback (most recent call last):
...
lxml.etree.XPathSyntaxError: Invalid expression





that's exactly what I was looking for. Thank you!
– Simakis Panagiotis
May 3 at 10:12






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

How to make file upload 'Required' in Contact Form 7?

Rothschild family

amazon EC2 - How to make wp-config.php to writable?