November MRF Processing Notes

November 30, 2022

Hi folks! We’re back from HLTH, where the interest and excitement around the payer price transparency data was palpable. We wanted to give folks a November update on our MRF processing tips and tricks post. As always, we hope these notes help other teams working with the MRF files, so we can all bring meaningful healthcare price transparency to the US market.

Key Highlights for November:

The TL;DR for this month is that payers were responsive to feedback from MRF data processors like Serif Health, and the data is getting (slightly) easier to work with.

Compression rates for Aetna and UHC got better; network contents largely unchanged.
Cigna overhauled their file structure…not entirely for the better
Highmark posted meaningfully sized traditional network files
We confirmed how multiple rate tiers and fee schedules are represented in MRFs (as multiple different rates for the same code and provider)


Compression rate improvements

One of the first and most obvious findings from November’s MRF processing runs was the smaller file sizes for UHC and Aetna. Nationwide network files, like the United P3 network, went from ~3.5 GB down to ~700 MB.

When we saw the size decrease, we were immediately concerned that some data had been removed or relocated out of these network files. We’ve developed some MRF data statistics tools at Serif Health to better understand file contents and quality and to enable us to quickly compare one MRF to another. When we ran these stats scans on the files, the NPI and EIN counts barely differed from October.

It’s quite impressive that HealthSparq and the payers can fit networks of 250k EINs and 1.3 million providers into one 700 MB file that covers 16.7k codes. Hopefully this helps with their serving bills as well.
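
To make the month-over-month comparison concrete, here is a minimal Python sketch of a stats scan. It is an illustration rather than the tooling described above. It assumes the MRF has already been decompressed and parsed into a dict, and that providers live in the top-level `provider_references` array of the CMS in-network schema. The function names `provider_stats` and `stats_delta` are hypothetical, and a real multi-gigabyte file would need a streaming JSON parser instead of loading everything into memory.

```python
from __future__ import annotations

from typing import Any


def provider_stats(mrf: dict[str, Any]) -> dict[str, int]:
    """Count distinct NPIs and EINs in an MRF's top-level provider_references."""
    npis: set[int] = set()
    eins: set[str] = set()
    for ref in mrf.get("provider_references", []):
        for group in ref.get("provider_groups", []):
            npis.update(group.get("npi", []))
            tin = group.get("tin", {})
            # The schema allows a TIN of type "ein" or "npi"; only count EINs here.
            if tin.get("type") == "ein":
                eins.add(tin["value"])
    return {"npi_count": len(npis), "ein_count": len(eins)}


def stats_delta(old: dict[str, int], new: dict[str, int]) -> dict[str, int]:
    """Absolute change per statistic between two releases (0 means unchanged)."""
    return {key: new[key] - old[key] for key in old}
```

Running `provider_stats` on the October and November releases and passing both results to `stats_delta` gives a quick signal of whether a smaller file actually dropped providers.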


Cigna’s new file structure

If you’ve been working with MRF files for the past four months, you know Cigna’s National PPO and OAP files are enormous. Clocking in at 650 GB and 550 GB compressed, these two networks were the largest MRF files we regularly handled in our pipeline. The files themselves were straightforward to work with and closely matched the CMS schema, but they still required special handling and custom code from our team.

So we were quite surprised to find that the November drop of the Cigna file was only 1.65 GB on disk!

We were curious about what had changed. Optimistically, we thought they might have adopted provider backreferencing and achieved better compression ratios for the JSON, similar to UHC and Aetna.

This was partially true - Cigna’s file does use provider backreferencing - but unfortunately Cigna made the tradeoff to also use remote provider references that require another web request to their MRF hosting site.

For the unfamiliar, a remote provider reference URL looks like this:


{"provider_group_id": 3050260, "location": "https://d25kgz5rikkq4n.cloudfront.net/cost_transparency/mrf/prov_reference_file/reporting_month=2022-11/prov_allcatd_grp_id=3050260/2022-11_3050260.json?Expires=1669650493&Signature=tRw60Kw~aX6Fdp8q0BSGgJ9I30qEhNOphJLrKeqQ2tshrJ4QKbC-~psJtDDuqC6QRowqNV8iBWRqW4AyV~0FNaJu-9yrrw8tn6wWeLv0o4NgjfYEYaZsdY4Ulc-B6-UYRxeMMm8LPa8FJr8V0gunKNfSYV02xOMqOBpSYj4sALjZ5ik35x~NlGO6s1vB~nDMr5iMgienGDPzujP8tGFA-Wbn~B8c0NyYeUtlhc5VHGowTfxNF~atW1fEOzWEr~oLRkZMLrjVx1oRmY1XlteqgJZO81EV0XhtIjkY63L~j1Fi3b9xkSxFdRdok6F8fx2-PSVale2VY8AGpU815U2oKw__&Key-Pair-Id=K1NVBEPVH9LWJP"}


This means the Cigna file uses substantially less disk space, but to extract useful data from it you need to make ~25,000 HTTPS calls to fetch the provider data it references. You also can’t easily grep through the compressed file to check whether a provider is included.
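
For readers who haven’t handled both styles, the following Python sketch shows roughly what resolving provider references involves. It is a sketch under assumptions, not a description of any payer’s or our own implementation: it assumes each entry in `provider_references` carries either an inline `provider_groups` array or a `location` URL, and that the remote file is a JSON object with its own `provider_groups` array. The names `fetch_json` and `resolve_provider_references` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Callable


def fetch_json(url: str) -> dict[str, Any]:
    """Default fetcher: one HTTPS GET per remote provider reference."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def resolve_provider_references(
    refs: list[dict[str, Any]],
    fetch: Callable[[str], dict[str, Any]] = fetch_json,
) -> dict[int, list[dict[str, Any]]]:
    """Map provider_group_id -> provider groups, following remote locations."""
    resolved: dict[int, list[dict[str, Any]]] = {}
    for ref in refs:
        if "provider_groups" in ref:
            # Inline backreference: the provider data is already in the file.
            groups = ref["provider_groups"]
        else:
            # Remote reference: costs one extra request per provider group.
            groups = fetch(ref["location"]).get("provider_groups", [])
        resolved[ref["provider_group_id"]] = groups
    return resolved
```

Passing the fetcher in as a parameter keeps the resolution logic testable offline and makes it easy to swap in caching or retries, which matters when a single file triggers tens of thousands of requests.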

Cigna has also started using signed CloudFront URLs, which bring a different problem: expiration times. When we went to reprocess Cigna for some customer requests over the past couple of days, we found that the remote provider reference URLs had expired on the morning of November 28th. Opening any of the provider reference URLs now returns a 403. Ouch.
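
One way to avoid being surprised by a 403 is to read the expiry out of the URL before queuing requests. The Python sketch below assumes the expiry is exposed as an `Expires` query parameter holding a Unix timestamp, as in the example URL above. The helper names are hypothetical, and the URL in the usage lines is a shortened stand-in that keeps only the real `Expires` value.

```python
from __future__ import annotations

from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def signed_url_expiry(url: str) -> datetime | None:
    """Return the UTC expiry encoded in an `Expires` query parameter, if any."""
    values = parse_qs(urlsplit(url).query).get("Expires")
    if not values:
        return None
    return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)


def is_expired(url: str, now: datetime | None = None) -> bool:
    """True if the URL carries an expiry that has already passed."""
    expiry = signed_url_expiry(url)
    if expiry is None:
        return False  # No expiry found; assume the URL is not time-limited.
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expiry


# Shortened stand-in URL (hypothetical host and signature, real Expires value).
url = "https://example.com/2022-11_3050260.json?Expires=1669650493&Signature=abc&Key-Pair-Id=XYZ"
print(signed_url_expiry(url))  # 2022-11-28 15:48:13+00:00
```

The timestamp decodes to 15:48 UTC on November 28th, which lines up with the morning expiration we ran into.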


Highmark Update

One of the big missing pieces in our data inventory in October was Highmark. We’d processed the Highmark Western NY and Eastern NY network files for several customers in September and October, only to find that each file listed in the index contained just a few thousand NPIs. Most other state-specific networks that we’ve extracted average a hundred thousand providers, and Highmark describes itself as one of the largest health plans in the country.

There was an oddity in their index file in October - the location block of the reference to their PPO and Traditional plans in the MRF file repeated the description line instead of actually being a URL.

We contacted Highmark about this and the November release contained a URL in that location spot:

That TR01.zip file contains several gigs of data across thousands of individual MRF files, much more appropriate for a network of their size. If you've been concerned about the data fidelity from Highmark, try reprocessing with their newly posted network zip files (Trad or PPO network).
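
An index-file oddity like the one described above can be caught with a simple automated check. The Python sketch below assumes the index follows the CMS table-of-contents layout, with `reporting_structure` entries that each hold an `in_network_files` list of `description` and `location` pairs. The function names are hypothetical, and the check only confirms that a location looks like an HTTP(S) URL, not that it resolves.

```python
from typing import Any
from urllib.parse import urlsplit


def is_fetchable_url(location: str) -> bool:
    """True if the string looks like an http(s) URL with a host."""
    parts = urlsplit(location)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def bad_locations(index: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (description, location) pairs whose location is not a URL."""
    problems: list[tuple[str, str]] = []
    for structure in index.get("reporting_structure", []):
        for entry in structure.get("in_network_files", []):
            location = entry.get("location", "")
            if not is_fetchable_url(location):
                problems.append((entry.get("description", ""), location))
    return problems
```

An empty result means every in-network entry at least points somewhere; anything returned is worth raising with the payer.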


Confirming Multiple Fee Schedules are Represented in MRFs

One of the most common questions we get from our customers is ‘why do several payers list the same code / modifier / physician group / site of service combination with multiple different prices, some of which repeat?’

Here’s an example of this from Cigna in Florida:


We were able to get confirmation from our customers that certain payers maintain multiple different fee schedules for providers based on:

Location of service (different from place of service!) - we have confirmation that Aetna has a different base fee schedule for a procedure done in NYC vs. rest of New York state, for example.
Credentials of the provider - in behavioral health, an MD vs. PsyD vs. NP vs. LCSW will all have different base fee schedules. Often the rates are tiered by a percentage, such that an MD would get 100% of the fee schedule, a PsyD would get 85%, and an NP or LCSW would get 75%. (Note: These percentage rates and credential tiers are not standard across payers, may differ by code, and each tier can obviously be negotiated.)

While these differences absolutely matter for stakeholders (a spread from $78.29 -> $149.26 is enormous!) the current CMS schema for MRFs doesn’t require these differences to be cleanly or clearly delineated - just that each differing rate is posted in the file.

We’re actively building classifiers for this price split, but payers, if you’re reading this, we’d love to see only the NPIs who can bill a given rate be associated with that rate. :)

Another way this presents itself in national network files (e.g., Aetna’s national Open Access EPO) is that payers may list numerous rates for a provider that correspond to the fee schedules in several different states. We saw one provider group that had six different rates in Aetna’s OAEPO MRF for behavioral health codes. On further investigation, these were confirmed as rates from two different fee schedules, one for Connecticut and one for New York (with three rates in each state corresponding to the MD / PsyD / NP credential split).
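
To see how these overlapping fee schedules surface in the data, the Python sketch below groups negotiated rates by code, provider group, billing class, modifiers, and place-of-service codes, then reports the combinations that carry more than one distinct rate. It assumes the CMS in-network layout (`in_network`, `negotiated_rates`, `negotiated_prices`) and is not the classifier mentioned above. The function names are hypothetical, and `tier_ratios` is only a rough heuristic that expresses each rate as a fraction of the highest one.

```python
from collections import defaultdict
from typing import Any


def rates_by_combination(mrf: dict[str, Any]) -> dict[tuple, list[float]]:
    """Map (code, provider ref, billing class, modifiers, service codes)
    to the sorted distinct negotiated rates posted for that combination."""
    found: dict[tuple, set[float]] = defaultdict(set)
    for item in mrf.get("in_network", []):
        for rate in item.get("negotiated_rates", []):
            for price in rate.get("negotiated_prices", []):
                for ref_id in rate.get("provider_references", []):
                    key = (
                        item["billing_code"],
                        ref_id,
                        price.get("billing_class"),
                        tuple(sorted(price.get("billing_code_modifier", []))),
                        tuple(sorted(price.get("service_code", []))),
                    )
                    found[key].add(price["negotiated_rate"])
    return {key: sorted(rates) for key, rates in found.items()}


def multi_rate_combinations(rates: dict[tuple, list[float]]) -> dict[tuple, list[float]]:
    """Keep only combinations that list more than one distinct rate."""
    return {key: vals for key, vals in rates.items() if len(vals) > 1}


def tier_ratios(rates: list[float]) -> list[float]:
    """Express each rate as a fraction of the highest rate (non-empty input)."""
    top = max(rates)
    return [round(r / top, 2) for r in rates]
```

With made-up rates of 75.00, 85.00, and 100.00 for one combination, `tier_ratios` returns `[0.75, 0.85, 1.0]`, the kind of pattern that hints at a credential-based split. Ratios that don’t fall into a clean pattern may instead point to fee schedules from different locations or states.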


If you’d like to leverage our expertise in handling MRF files, get in touch!

Serif Health specializes in healthcare price intelligence and can deliver data via our interactive portal, API, or bulk delivery in an easier-to-use format than the standard MRF JSON. We can also customize data transformations for your use case. Contact us for a free data sample and quote today.
