eXtensible Markup Language (XML)

XML stands for eXtensible Markup Language.

XML documents are structured as 'element trees' that branch out from a single root element (in the jobs.xml example below, the root element is jobs). Note: elements can have sub elements, which can in turn have their own sub elements, and so on.

Elements can have text (e.g. KFC, casual, 21)

Elements can have attributes (e.g. category="retail")
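
As a rough illustration of these three ideas (sub elements, text, attributes), the sketch below parses a one-job snippet from a string using Python's standard xml.etree.ElementTree module. The snippet string is made up for this example and is not one of the files used later.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-job snippet, parsed from a string instead of a file.
snippet = '<job category="retail"><location>IKEA</location></job>'
job = ET.fromstring(snippet)

print(job.tag)                    # job
print(job.attrib)                 # {'category': 'retail'}
print(job.get("category"))        # retail
print(job.find("location").text)  # IKEA
print(len(job))                   # 1 (number of direct sub elements)
```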

RSS (Really Simple Syndication, originally RDF Site Summary) is a web feed format. RSS feeds are written in XML and are generally used for sharing frequently updated content (e.g. news headlines, weather updates, etc).
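
Because an RSS feed is just XML, it can be read with the same tools as any other XML file. The sketch below assumes the usual RSS 2.0 layout (an rss root containing a channel, which contains item elements). The feed text, titles and links are invented for illustration; a real feed would normally be downloaded first.

```python
import xml.etree.ElementTree as ET

# Made-up feed content for illustration only.
rss_text = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>Headline one</title><link>https://example.com/1</link></item>
    <item><title>Headline two</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""

channel = ET.fromstring(rss_text).find("channel")

# findtext() returns the text of the first matching sub element.
headlines = [item.findtext("title") for item in channel.findall("item")]

print(channel.findtext("title"))  # Example News
print(headlines)                  # ['Headline one', 'Headline two']
```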

Below are an XML file (jobs.xml) and a Python script (jobs_xml.py) that parses it.

Ensure both files are saved in the same folder, and run the script from that folder (the script opens jobs.xml using a relative path):

jobs.xml

<jobs>
  <job category="fastfood">
    <location>KFC</location>
    <conditions>part time</conditions>
    <payperhour>21</payperhour>
  </job>
  <job category="retail">
    <location>IKEA</location>
    <conditions>casual</conditions>
    <payperhour>19</payperhour>
  </job>
  <hours>
    <monday>9-5</monday>
    <tuesday>9-3</tuesday>
  </hours>
</jobs>

jobs_xml.py

import xml.etree.ElementTree as ET

# Parse the file and get the root element (<jobs>)
root = ET.parse('jobs.xml').getroot()

print(root.tag)  # prints 'jobs'
print(root.find('hours').find('tuesday').text)  # prints '9-3'

# Iterating over an element yields its direct sub elements
for day in root.find('hours'):
    print(day.tag)  # prints 'monday' and 'tuesday'

# findall('job') returns every <job> directly under the root
for job in root.findall('job'):
    print(job.get('category'))  # attribute: 'fastfood', then 'retail'
    for element in job:
        # sub elements: location, conditions, payperhour
        print(element.text)
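
Printing values is fine for exploring a file, but it is often more useful to collect them into Python data structures. The following is one possible sketch, not part of the original script: the helper name jobs_to_dicts is hypothetical, and the job data is embedded as a string (copied from jobs.xml, without the hours element) so the example runs on its own.

```python
import xml.etree.ElementTree as ET

# Job entries copied from jobs.xml, embedded so no file is needed.
JOBS_XML = """<jobs>
  <job category="fastfood">
    <location>KFC</location>
    <conditions>part time</conditions>
    <payperhour>21</payperhour>
  </job>
  <job category="retail">
    <location>IKEA</location>
    <conditions>casual</conditions>
    <payperhour>19</payperhour>
  </job>
</jobs>"""

def jobs_to_dicts(root):
    """Return one dict per <job>, keyed by attribute and sub element names."""
    records = []
    for job in root.findall("job"):
        record = {"category": job.get("category")}
        for child in job:
            record[child.tag] = child.text
        # Element text is always a string, so convert the pay rate to a number.
        record["payperhour"] = int(record["payperhour"])
        records.append(record)
    return records

jobs = jobs_to_dicts(ET.fromstring(JOBS_XML))
best = max(jobs, key=lambda j: j["payperhour"])
print(best["location"], best["payperhour"])  # KFC 21
```

To use this with the real file, ET.fromstring(JOBS_XML) could be swapped for ET.parse('jobs.xml').getroot().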