Data loading from XML file

I am now in the position of a user of Igor who has encountered a problem that my experience doesn't help with much...

Since my Chevy Volt is now almost a year old, I'd like to analyze my electric usage patterns. Portland General Electric (PortlandGeneral.com), if you look hard enough, has a link where you can download hourly usage covering a considerable time period. I guess to make sure that such a useful thing wasn't overly helpful :), they pack it into an XML file of their own design. A support person from PGE suggested that it was formatted for some specific third party. Certainly not for me, as I have managed to avoid learning much about XML.

So I was wondering if perhaps the worthies who inhabit IgorExchange might have some advice on how to proceed. To that end, I have attached a small sample of the data file.

Thanks for any insight.
Portland_General_Electric_Electric_60_Minute_11-01-2017_11-08-2017_20171226182519987.zip (3.67 KB)
Open the XML file to start with in a web browser. I think Chrome is the best for the job. This will enable you to inspect the file in tree form. Once you figure out which nodes you'd like to load then use the XMLutils XOP to load the data. To do that you need to know a bit about XPath, so look at a few tutorials on the web.

One of the nodes is:
<br />
<IntervalReading><br />
<timePeriod><br />
<duration>3600</duration><br />
<start>1509537600</start><br />
</timePeriod><br />
<value>742</value><br />
</IntervalReading><br />

It looks like you used 742 of something in an hour, starting at an epoch time of 1509537600.

Thanks, Andy. No doubt 742 watt-hours, or since it happened in one hour, 742 watts.

I can figure out the epoch from the dates covered by the file....
... and that large number appears to be a Posix epoch for Wednesday, November 1, 2017 12:00:00 PM GMT, which makes sense since I took a week of data starting with Nov. 1, 2017.

I think I'm on the way. It would be nice if PGE made this available in a way that didn't involve learning a new technology. Or maybe I could set myself up as a consultant who can decipher electric usage :)

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
•fid = XMLopenfile("teapot:Users:andrew:Downloads:Portland General Electric_Electric_60_Minute_11-01-2017_11-08-2017_20171226182519987.xml")

print(fid)
  74

// this gets duration/start/value, but you also get blank space from the nodes.
// that's why we're going to use sockitwavetostring, sockitstringtowave
// if you use "//ns:value" you won't need to jump through those hoops.

•xmlwavefmxpath(fid, "//ns:IntervalReading", "ns=http://naesb.org/espi&quot;, "\r\n")
string nums;
•sockitwavetostring/Txt=" " M_xmlcontent, nums
// all numbers appear to be integers
•sockitstringtowave/Tok=" " 2^7, nums
edit W_stringtowave
xmlclosefile(fid, 0)
Can't you just download the data as a .csv file? My meter can't actually provide this kind of data so I can't test this, but if I click the "Export Data" link under the controls where you set what kind of data you want, a .csv file is downloaded.
It is Green Button data. And yes, a CSV file is available but when I tried to use it, I was only able to get a week at a time. I want a year, of course. The Portland General Electric web page where you can get the data seems kind of buggy.

But now I'm actually getting involved with this as an XPath learning tool. But I'm already stuck trying to understand. Based on the list you get from XMLelemlist(), I did this:
xmlwavefmxpath(73, "/*", "", "\n")

That gets lots of output, with lots of stuff I don't care about. But it seemed like this should get the same output, based on the material at the www.w3schools.com site:
xmlwavefmxpath(73, "/feed", "", "\n")

But that gets nothing. I looked at your example, and tried this:
xmlwavefmxpath(73, "/ns:feed", "", "\n")

That gets me an error: XMLutils: Failure with XPath expression

Edit: These questions are kind of stream-of-consciousness. Might be good to start at 4...
So:
1) What is "ns"?
2) That tutorial keeps talking about "the current node". Does that mean the node that you presently are at while parsing the Xpath?
3) It seems like I should be able to use literal node names, like "field" or "value". Why doesn't that work as in my attempt above?
... some time passes ...
4) I see that your example appears to define "ns" by setting it in the second string input. So is "//ns:IntervalReading" the same as "http://naesb.org/espi:IntervalReading"?
5) I take it that the namespaces are defined by strings like xmlns="http://naesb.org/espi" which is an attribute of UsagePoint node. Do all children of UsagePoint have to be qualified by the namespace?
6) This: xmlwavefmxpath(73, "//ns:UsagePoint/ServiceCategory/kind", "ns=http://naesb.org/espi", "\r\n") gets me nothing, but
This: xmlwavefmxpath(73, "//ns:kind", "ns=http://naesb.org/espi", "\r\n")
gets me a single row "0" which is, in fact, the value of UsagePoint/ServiceCategory/kind

That's probably enough questions for now. Thanks for your help!


John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Beware - the fact that XMLutils works for a lot of stuff does not mean I'm an expert on XML. A more conversant person may or may not be horrified by the way that the XOP works.

johnweeks wrote:

That gets me an error: XMLutils: Failure with XPath expression


When you get this error this means there is a problem with the XPath expression with respect to the file. E.g. you didn't use the right namespace.

johnweeks wrote:

1) What is "ns"?


Namespace - https://en.wikipedia.org/wiki/XML_namespace

johnweeks wrote:

3) It seems like I should be able to use literal node names, like "field" or "value". Why doesn't that work as in my attempt above?

a mixture of not using the qualified namespace and a problem with the XPath expression will be the cause.

johnweeks wrote:

4) I see that your example appears to define "ns" by setting it in the second string input. So is "//ns:IntervalReading" the same as "http://naesb.org/espi:IntervalReading"?

you need to define the namespace when you want to examine specific node names. For example you could do "//weeksies:IntervalReading" with "weeksies=http://naesb.org/espi"

johnweeks wrote:

5) I take it that the namespaces are defined by strings like xmlns="http://naesb.org/espi" which is an attribute of UsagePoint node. Do all children of UsagePoint have to be qualified by the namespace?

I think so.

johnweeks wrote:

6) This: xmlwavefmxpath(73, "//ns:UsagePoint/ServiceCategory/kind", "ns=http://naesb.org/espi", "\r\n") gets me nothing, but
This: xmlwavefmxpath(73, "//ns:kind", "ns=http://naesb.org/espi", "\r\n")
gets me a singl
support@wavemetrics.com


I think you'd need to do: xmlwavefmxpath(73, "//weeksies:UsagePoint/weeksies:ServiceCategory/weeksies:kind", "weeksies=http://naesb.org/espi", "\r\n")

That XPath would get all the "kind" nodes that are direct daughters of "ServiceCategory" which are direct daughters of all the "UsagePoint" nodes. It would not match "kind" nodes that didn't follow that direct ancestry.

XPath is very powerful, but there is a learning curve. I normally have to resort to reading XPath tutorials and a bit of trial and error to get things to work as I want.

A.
Depending on your perspective, I will throw a monkey wrench or provide a viable alternative approach to this discussion. At one point, I had a working knowledge of XSL to transform XLM in to Igor Pro variables

http://www.igorexchange.com/project/udStFiLrXML

I might therefore boldly or naively say, the use of XSL could be a more useful method, if for no other reason that you can use the XSL in other ways (e.g. at the shell level) and obtain a host of other output formats (e.g. CSV, RTF, HTML ...). I _think_ that XSL is also a bit more transparent in syntax settings to parse the tree at various levels.

But then, to mirror Andy's experience, even on the best of days, I generally had to rely on trial and error as well as tutorials to determine how to load deeply-nested data.

And then I woke up one morning and found everything is now about JSON :-)

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
Not a monkey wrench, just a different path... Though I confess that the Sunk Cost bias kicks in!

I'll try your suggestions, Andy. Trial and error works for me. You use a lot of my trial and error :)

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Andy- your answers were very helpful, and informed some re-reading of the tutorials. The result is... Success!
Two calls to xmlwavefmxpath() extracted each hourly reading, and the start time of that reading. Here are the calls I used:
xmlwavefmxpath(72, "//ns:IntervalReading/ns:timePeriod/ns:start", "ns=http://naesb.org/espi&quot;, "\n")
xmlwavefmxpath(72, "//ns:IntervalReading//ns:value", "ns=http://naesb.org/espi&quot;, "\r\n")

That results in M_xmlcontent waves with *only* the desired info, which makes it easy to extract to numeric waves:
make/D/N=(DimSize(M_xmlcontent, 1))/O times
times = str2num(M_xmlcontent[0][p])
times += 2082844800     // offset from 1/1/1904 to 1/1/1970, converts from Unix epoch to Apple epoch
make/D/N=(DimSize(M_xmlcontent, 1))/O watts
watts = str2num(M_xmlcontent[0][p])

Naturally, since the time periods are all the same, I set the X scaling of my watts wave appropriately. Sure is nice that the Change Wave Scaling dialog can work in date/time!

Graph is attached. You can see where I plug in my Volt, usually about 6:00 PM when I get home from WaveMetrics. But Wednesday is squash league night and I don't plug in until around 10 or 11 PM. If I had a 240V level 2 charger, those peaks would be higher and shorter. But for my lifestyle, every night charging with the portable charger is generally enough and its hard to justify a $400+ charger and installation of a 240V circuit for it.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Nov1throughNov8.png (78.09 KB)
And, I'm sure your wondering, the next step is to write a generalized function to do this. I also want to extract weekly data over a year so that I can stack weekly traces, etc., etc.

The ultimate goal is to apply the published rates for time-of-day pricing and compare with my current constant-price plan.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Cool! Glad this worked. It seems you have a neat analysis problem on your hobby bench for the new year.

The XSL option would likely still be a better option even considering the already-invested costs to learn xpath and XMLutils when you want to work the XML also using something besides Igor Pro and/or extract the XML to some other format besides Igor Pro data.

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
I'm not sure I understand what an "event" is in this context. I have a file full of information that is obscured by being XML formatted. The problem was extracting the interesting information.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
OK, I get it. As the reader reads through the XML file, and encounters various things, it can call a function with an "event". So it's a parser with callbacks.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
SAXS parsers are good if you know what you're looking for. If you don't then you sometimes have to do a few sweeps through a file. BTW XSLT transforms won't help you when you want to write an XML file (nor will parsers).
XMLutils is designed to write XML files as well, but it will slow down on enormous files.
I would just treat the XML file as an ordinary file.
It seems not so difficult to read the usefull infomation using freadline and regular expressions...