On the merits of an independent data layer

My conversation with KGS Buildings' Nick and Alex continued last week. I’ve had a lot of fun nerding out with them and others who have reached out in the last few weeks. I want to share a brief nugget from our conversation on the merits of an independent, open data layer.

💭 Side note: I’m thinking about recording these types of conversations for a podcast. If you like that idea, let me know below. 🙏

Here’s a quick summary of the independent data layer concept:

When designing your smart building stack, you separate the Integration and Historian Layers from the Application Layer rather than choosing one vendor’s solution for the whole stack. See my What is EMIS? essay to understand this delineation in more depth. You may also see it referred to as a data lake or middleware. I’m sure there are new acronyms for it—our industry sure loves acronyms.

The proponents of this approach tout the following primary benefit, which can sound pretty great from the building owner’s perspective:

You simply need to tag your data, put it in a data lake, and then plug in any application, like fault detection and diagnostics (FDD). If you don’t like the FDD, you still own the tagged points (the information model) and can simply plug in a different FDD vendor.
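In code terms, the pitch imagines something like the following minimal Python sketch. The tag names are loosely modeled on Haystack-style tag sets, and the `find_points` helper is hypothetical, not any vendor’s actual API:

```python
# Hypothetical illustration of the "plug-and-play" pitch: points are tagged
# once, stored in a data lake, and any FDD application finds what it needs
# by querying tags alone.

points = [
    {"id": "ahu1-sat", "tags": {"ahu", "discharge", "air", "temp", "sensor"}},
    {"id": "ahu1-saspt", "tags": {"ahu", "discharge", "air", "temp", "sp"}},
    {"id": "vav12-zt", "tags": {"vav", "zone", "air", "temp", "sensor"}},
]

def find_points(points, required_tags):
    """Return every point carrying all of the required tags."""
    return [p for p in points if required_tags <= p["tags"]]

# Vendor A's FDD rule, and a drop-in replacement from Vendor B, both issue
# the same tag query, so (in theory) swapping vendors costs nothing.
supply_temps = find_points(points, {"ahu", "discharge", "air", "temp", "sensor"})
print([p["id"] for p in supply_temps])  # ['ahu1-sat']
```

That is the whole promise in miniature: the query lives in the application, the tags live with the owner.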

This sounds pretty compelling, right? I know it does… because I’ve made this argument once or twice in my past life as a consultant. The argument looks something like this:

🙂 It’s a risk-free first step on the journey to a smart building: unlock and model the data that’s currently locked away in proprietary and siloed systems.

😁 It creates a single source of truth by enforcing one data model (e.g. Project Haystack or Brick Schema) for all applications and promotes interoperability.

🤩 It reduces dependence on one vendor and promotes a cooperative ecosystem. Depending on the building owner’s needs, it may be most beneficial to select multiple vendors to fulfill all the capabilities desired. If the data layer platform is designed as such, it could start to look like an app store for the building.

🥰 Similarly, it de-risks the investment by allowing the owner to trial, test, and compare multiple smart building applications without needing to restart the costly integration from scratch.

But, but, but:

Once you peek under the hood of this approach, as Alex and Nick helped me do, it might not be so pretty. Here’s their take:

While this sounds great to an owner, it’s simply not true. In a perfect world with perfectly understood points and metadata about those points, as well as metadata about the equipment and system interactivity (aka sequences), this would be possible. But this is not the world we live in. It’s hard to imagine living in it for quite some time.

As I unpacked this further after our conversation, it started to look worse. Here’s why:

🤔 De-risking strategies like this perpetuate the myth that these technologies aren’t quite ready for primetime. Some vendors in this space have proven their solutions in real buildings over and over again—they’re already primetime.

😏 It might actually increase risk for the owner by adding complexity, increasing the timeline, delaying results (e.g. energy savings), and involving more vendors that need to work together.

😬 It probably won’t work. If you don’t understand and plan for the applications that will use the data, you’ll struggle to model it appropriately. Today’s applications accommodate and even require vastly different types of metadata, meaning even standardized tagging is bound to fail applications that need more or need it in a different format.

☹️ Complex applications like FDD are not an undifferentiated commodity. Here’s Alex and Nick again:

For the foreseeable future, it is not a commodity. There are enormous differences in the complexity of the information models vendors are employing, and therefore in the usefulness of the FDD results.

As we discussed last week, these guys know a thing or two about FDD results.

🧐 Just because your data is in full-stack software doesn’t mean it’s not open and usable. The best vendors (but certainly not all vendors) can provide the full stack and still serve as the data layer for other applications.
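The metadata-mismatch point above can be made concrete with a small, hedged sketch. Both vendor requirement sets and the point record below are invented for illustration and don’t reflect any real vendor’s schema:

```python
# Hypothetical: two FDD vendors, same building, same tagged points.
# Vendor A only needs tags; Vendor B also needs engineering units and an
# equipment relationship, metadata the one-time tagging pass never captured.

point = {
    "id": "vav12-zt",
    "tags": {"vav", "zone", "air", "temp", "sensor"},
    # Note: no "unit", no "fedBy" relationship, no sequence metadata.
}

def meets_requirements(point, required_tags, required_fields=()):
    """Check a point against one application's metadata requirements."""
    has_tags = required_tags <= point["tags"]
    has_fields = all(f in point for f in required_fields)
    return has_tags and has_fields

vendor_a = meets_requirements(point, {"zone", "air", "temp"})
vendor_b = meets_requirements(point, {"zone", "air", "temp"},
                              required_fields=("unit", "fedBy"))

print(vendor_a, vendor_b)  # True False
```

Same building, same “standard” tags, and one application works while the other cannot even onboard, which is exactly the failure mode described above.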

Where do you stand on this? 

If this article resonated with you, I invite you to subscribe to my weekly(ish) newsletter on smart buildings and analytics. It will keep you up to date on the industry and on any new articles I write.


7 thoughts on “On the merits of an independent data layer”

  1. Your main argument against a middleware layer is that any current tagging model won’t work for all applications. This is probably true for KGS Clockworks as it is a very robust product that requires a lot from the tagging model, but not for many/most of the other current and developing products. Also, doesn’t this argument defy the very idea of an open standard tagging model? What we need is a better tagging model, and that’s happening.
    A middleware data acquisition/trending/tagging layer might promote commodification, which is never popular with incumbents, but does promote mass adoption of technology by reducing prices. This argument reminds me of Honeywell’s early criticism of the open BACnet standard.

    1. Great points Terry! As I’ve worked my way through these arguments, it does seem like a more robust semantic model would “lift all boats”.

    2. Hi Terry, I added a comment below to better explain my point of view on the matter. I do agree with James’ points that a well defined robust model would ‘lift all boats’.

  2. Jim, Jack King here at BuildinglogiX. You might want to explore our offering for your next post. We have over 15 years in the building analytics space and millions of square feet of connected buildings utilizing our FDD applications for real time commissioning and performance enhancement. We count many world class health care providers, government agencies and universities as clients. We are a quiet company but we have a huge footprint and a very unique offering.

    Hope all is well
    Jack King

  3. I think to be successful one has to first understand the issues that are actually causing things like analytics to fail! And they are not what people typically think of.

    Analytics was created for high-speed, high-power industries: the kinds of processes typically found in a manufacturing facility and run by a PLC industrial control system. That compute power is not available in the typical building automation system.

    BAS networks were not designed for the level of throughput these systems need. The control power of a BAS is at the edge, not remote.

    Naming conventions are your nemesis! This topic is exceptionally more painful for large or old organizations. The older and bigger you are, the more significant this issue will be.

    Normalizing the data from these systems is typically done by hand. [Slow and expensive.]
    Pricing models are based on a “per point” model, and this always becomes a cost issue, forcing the owner to decide which points will be gathered versus every single data point of every piece of equipment in every building. [600+ buildings, 1.5M data points.]

    We developed a plan (with our partners) that solves all of these issues with a solution that is:
    - Cost effective
    - Automated (mostly)
    - Provides more than just pretty trends and dashboards; we need real insights from the data and use that data to drive issues to resolution.

    In this approach, we separate the data collection and normalization from the analytics and FDD. Many solutions want to sell you hardware and software to trap you in their system, so that moving from one analytics/FDD vendor to another will force the building owner to start the process all over again, or abandon the project altogether because the product failed to perform.

    The final item to consider is that your cloud option needs to be cloud agnostic, so that it can move from one cloud vendor to another and/or be shared between them. As cloud compute options grow, the building owner needs to have options and understand that if corporate signs a contract with the cloud provider you are using for analytics/FDD, you just might have to start all over again, and I can tell you that would likely kill the project.

    Yes, this has actually been done; it’s not just an idea or a pipe dream!

    1. Tim, 
      Thanks for laying this out. I agree with the challenges you outlined. Question: how did you, or how are you, addressing the concerns with this approach that Nick and I laid out in the blog post and in his comments? 

  4. An issue we have seen is that sites purportedly ‘follow Haystack’ but do not really represent in the tags, relationships, or taxonomy what is meaningful for accurate and actionable FDD. There is, I believe, a myth that if you sprinkle Haystack or Brick on your site, you magically have interoperability among many different applications. We’re not there yet, and we need a broader industry coalition to make that happen. And to be clear, I do support the vision of an open, interoperable data model and do not intend to denigrate the good work by folks engaged in Haystack or Brick. But I think we need to be honest with owners about the state of the industry.

    From the Haystack board about Haystack 4.0: “A new framework for the definition of tags and tag combinations is the core feature of Haystack 4.0. Defs are modeled as dicts just like all Haystack data.” That’s great, but within that framework, a metadata taxonomy for building systems is left for vendors to define using the available tags and data structures. If vendors define whatever they want, one vendor’s taxonomy, whether it is represented in a Haystack 4.0 data structure or anything else, will not necessarily be interoperable with another vendor’s application, whether they use Haystack or not!

    Brick is getting a little closer to specifying the taxonomy with its class, type, and subclass structure, but there seems to be a lot still to work out, and things are still changing, which is OK! For example, just recently, equipment classes that included AHU for points were eliminated. From the Brick website: “Because there are no formal rules for how tags can be used, Haystack-based descriptions of buildings tend to consist of ad-hoc collections of tags, resulting in highly custom and inconsistent modeling practices across sites. Brick includes a tagging system similar to Haystack that augments tags with formal semantic rules that promote consistency and interoperability.” I agree, and this is the right direction, and I support it. I also appreciate their adoption of semantic web standards – also great.

    To support interoperability of any vendor’s applications, we need an information model that uniquely defines the elements of building systems in a way that is commonly defined, that can be shared M2M, and that truly fulfills real use cases. I fully support the idea of standard information models to support M2M metadata exchange for applications like FDD, MPC, ADR, and more. I think the work at ASHRAE to pull together stakeholders like Brick, Haystack, and those like me at KGS is the right effort. And let me be blunt: if you are using the argument that you must use Haystack as a smoke screen to advocate and enforce the use of one vendor’s product, that is proprietary, not open (with all due respect, sincerely, to my peers among the competition who do have an honest dialogue about it). To me, the open interoperability we need is around the ontology and how to share it. The data architecture internal to an application is up to the vendor.

    I also feel compelled to share that we do, in fact, share numerous tags with the Haystack tag set (and Brick) and can consume Haystack tags into our onboarding process. We can also produce Haystack tags from our information model. And, once things settle out, we intend to share metadata in RDF/TTL as well. Our information model aligns very well with Brick, although not completely in every way.

    Amidst all this, I worry that customers are being misled into thinking that if they pay a vendor to apply some rudimentary tags, they are all set for the future. For example, we’ve seen sites where the only tags applied to zone units were equip and hvac. Does that make them ready for FDD or any smart application? No! The main point is that however you are adding metadata to a site, it should be sufficient to fulfill at least one use case, or else it is wasted! And just saying it is for any use case is not enough if you aren’t clear about what those use cases require.
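As a toy illustration of that last comment (Python; the tag names and the readiness rule are illustrative only, not any vendor’s actual onboarding logic), a zone unit tagged with nothing but equip and hvac fails even a minimal use-case check:

```python
# Illustrative FDD-readiness check: a use case declares the metadata it
# needs; tagging is only "done" if at least one use case is satisfied.

zone_unit = {"id": "vav12", "tags": {"equip", "hvac"}}

# A hypothetical minimal tag requirement for a zone-temperature FDD rule.
ZONE_TEMP_FDD_NEEDS = {"equip", "hvac", "vav", "zone", "temp"}

def ready_for(entity, use_case_tags):
    """Return (is_ready, missing_tags) for one declared use case."""
    missing = use_case_tags - entity["tags"]
    return (not missing, sorted(missing))

ok, missing = ready_for(zone_unit, ZONE_TEMP_FDD_NEEDS)
print(ok, missing)  # False ['temp', 'vav', 'zone']
```

The point is not this particular rule, it is that “tagged” only means something relative to a use case the tags can actually serve.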
