Let's do something completely different on CSA
There's a great Monty Python sketch in which architects pitch concepts for a new apartment building. One architect shows off a model of his design and proudly describes the tenants' experience as they pass through the various grand and breathtaking features of the building. He continues:
The tenants arrive in the entrance hall here, are carried along the corridor on a conveyor belt in extreme comfort and past murals depicting Mediterranean scenes, towards the rotating knives. The last 20 feet of the corridor are heavily soundproofed. The blood pours down these chutes and the mangled flesh slurps into these... [At this point, a stunned and horrified commercial real estate developer interrupts the architect. Watch it.]
You see, this architect specializes in slaughterhouses; it's all he really knows. Ask him for a block of flats, and he gives you an abattoir.
I recently thought of this sketch after re-reading the report released in late June by the National Academies of Sciences, Engineering and Medicine (NAS) on the Safety Measurement System (SMS) and the Compliance, Safety, Accountability (CSA) program. FMCSA is expected to hold a public meeting next month on the report and the agency's upcoming corrective action plan on CSA and SMS.
The 12-member review panel on SMS and CSA is, formally, an offshoot of the Committee on National Statistics of the NAS Division of Behavioral and Social Sciences and Education. Three members have the word "statistics" in their titles; three others' titles include the word "biostatistics." The panel appears to have as many experts in the medical field as it does experts in transportation.
This group, heavily oriented toward medicine and statistics, has recommended that the Federal Motor Carrier Safety Administration replace SMS with a highly complex statistical model based on a concept called Item Response Theory (IRT), which is frequently used to assess the performance of hospitals and other health care institutions.
Surprise!
This is no real knock on the NAS panel's work; like the architect in the Monty Python sketch, they are just designing what they know. And they probably have done a masterful job, statistically speaking. Of course, we mostly have to assume that is the case; key sections of the report require college-level work in statistics to understand. Most of Chapter 4 might as well be Greek. (Some of it literally is Greek, as the Greek alphabet is used to represent various statistical concepts and functions.)
To appreciate how incomprehensible the panel's recommendation is to the average person, consider its attempt to simplify the difference between SMS and its recommended alternative based on IRT:
The main difference is that the [Behavior Analysis Safety Improvement Categories] methodology uses severity weights that are dependent on expert opinion and empirical observations in a less empirical and static manner, whereas the item difficulty and discrimination parameters are estimated based on a formal combination of the observed data and expert opinion through the use of priors that are updated dynamically as more data are collected.
Exactly! They read my mind.
Seriously, what the NAS panel has recommended might be better than current SMS methodology. Or it might not be. Who really knows? Indeed, even the NAS report acknowledges that its proposed model is no more transparent to the average motor carrier than SMS is.
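For a flavor of what the panel is describing, consider the textbook two-parameter IRT model (a generic illustration only, not the panel's actual specification). It says the probability that carrier i is cited for violation type j at an inspection is

P(violation) = 1 / (1 + exp(-a_j * (θ_i - b_j))),

where θ_i is the carrier's underlying propensity for violations (think "safety culture"), b_j is the violation's "difficulty" (how rarely it is cited across the industry) and a_j is its "discrimination" (how sharply it separates good carriers from bad ones). The panel's recommendation layers Bayesian priors, dynamic updating and tens of millions of inspection records on top of that basic skeleton.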
And the NAS panel concedes that even FMCSA itself probably would not really understand this new model:
Clearly, the proposed Bayesian IRT model, which involves use of 20–30 million observations and hundreds of variables to estimate hundreds of model parameters is something that requires very specific expertise, usually found in academic statisticians who carry out research on these specific models. The sparsity of data and other aspects of the problem are likely to raise some computational complexities that would require software development. The model development costs will therefore involve contracting with a small group expert in these models. However, once developed, FMCSA staff would be very capable of maintaining the model, including refitting parameter estimates, conducting model validation exercises, and incorporating improved inputs. Finally, given that the IRT model, like SMS, is sensitive to outlying values in MCMIS, FMCSA should consider institution of edit routines to identify discrepant submissions, and imputation procedures to fill in for input values that fail the edits.
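To make the point concrete, here is roughly what a drastically stripped-down version of that exercise looks like. This is a toy sketch in Python, not the panel's model: the fleet of 50 carriers, the six violation types, the priors and the simulated 0/1 "violation cited" data are all invented for illustration, and the real proposal would involve millions of records and far more statistical machinery.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic function

rng = np.random.default_rng(0)

# Toy dimensions -- purely illustrative, nothing like FMCSA's real data.
n_carriers, n_items = 50, 6

# Simulated "true" parameters: carrier propensity for violations (theta),
# item discrimination (a > 0) and item difficulty (b).
theta_true = rng.normal(0.0, 1.0, n_carriers)
a_true = rng.lognormal(0.0, 0.3, n_items)
b_true = rng.normal(0.0, 1.0, n_items)

# One 0/1 observation per carrier-item pair: 1 = violation cited at inspection.
p_true = expit(a_true * (theta_true[:, None] - b_true))
y = rng.binomial(1, p_true)

def neg_log_posterior(params):
    """Negative log-posterior of a two-parameter IRT model with normal priors."""
    theta = params[:n_carriers]
    log_a = params[n_carriers:n_carriers + n_items]
    b = params[n_carriers + n_items:]
    a = np.exp(log_a)                      # keeps discrimination positive
    p = expit(a * (theta[:, None] - b))
    eps = 1e-9                             # guards against log(0)
    log_lik = np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    log_prior = -0.5 * (np.sum(theta**2) + np.sum(log_a**2) + np.sum(b**2))
    return -(log_lik + log_prior)

# Fit by maximum a posteriori -- a crude stand-in for the full Bayesian machinery.
start = np.zeros(n_carriers + 2 * n_items)
fit = minimize(neg_log_posterior, start, method="L-BFGS-B")
theta_hat = fit.x[:n_carriers]

# The rough analogue of an SMS percentile: rank carriers by estimated propensity.
print("Five 'worst' carriers by estimated theta:", np.argsort(theta_hat)[-5:])
```

Even at this toy scale, judging whether the ranking is trustworthy requires statistical training; scale it up to roughly half a million carriers and hundreds of violation types, and the average safety manager has no realistic way to audit the result.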
All of this might as well be magic as far as motor carriers are concerned. A carrier likely will find it even harder than it is today to figure out whether its actions will make it look better or worse in the eyes of FMCSA. The agency, which has said it plans to adopt an IRT model, probably finds this point appealing. If a carrier can't understand how the methodology works, it can't really challenge it.
For carriers that consider SMS to be unfair and unpredictable, the reform of CSA could end up worse than the status quo – especially if FMCSA somehow gets away with cherry-picking the NAS report and ignoring some important recommended improvements to its deficient data. FMCSA also is supposed to address the recommendations of the February 2014 Government Accountability Office report on CSA, but the degree to which FMCSA truly must comply is a judgment call for the U.S. Department of Transportation's Office of the Inspector General.
A faulty premise
In some ways, the entire CSA/SMS review was flawed from the outset because the Fixing America's Surface Transportation (FAST) Act presumes that FMCSA's monitoring of all motor carriers will be based on a statistical analysis of roadside inspection data and crashes. Lawmakers just wanted better statistics and better analysis.
But there's a fundamental problem that few acknowledge and nobody can fix: You can't analyze data you don't have.
The NAS panel doesn't fully agree with that statement; its vision for an IRT-based model is supposed to account somewhat for data insufficiency and a host of other drawbacks to the current SMS methodology. This approach might be plausible for analyzing the industry as a whole or for looking at individual carriers for which there's fairly rich data. But imputing results for a small carrier based on a model it cannot comprehend, rather than on its actual on-the-road performance, is grossly unfair and would lead to safer roads only by random chance.
Inspection data obviously would drive any statistical model for monitoring carriers, and there just isn't enough of it to monitor the whole industry effectively. Consider that of the roughly 491,000 active U.S. interstate motor carriers:
More than 182,000 (37%) have no reported inspections in the past two years;
Nearly 295,000 (60%) have fewer than three reported inspections, meaning they can't be measured in any SMS BASIC;
About 39,000 (8%) have been inspected 20 or more times – the minimum data sufficiency threshold recommended by GAO.
And while there certainly is great concern about the number of crashes and fatalities involving large trucks, crashes remain relatively rare. Just 14% of U.S. interstate motor carriers experienced a recordable crash in the past two years, and only 5% of carriers had more than one recordable crash. Crash rates calculated based on such minuscule numbers produce meaningless results, even without addressing the fact that many crashes are not preventable.
FMCSA recognizes the credibility problem inherent in this volatility. Currently, the threshold for being measured in the Crash Indicator BASIC is two crashes, which captures a mere 5% of U.S. interstate carriers. However, FMCSA last year published proposed SMS enhancements, one of which would raise the Crash Indicator BASIC threshold to three crashes, dropping the number of measured carriers to around 13,600, or about 2.8%.
It's a good thing, of course, that recordable crashes are so rare, but the dearth of data for both inspections and crashes yields extremely volatile results and essentially useless metrics for measuring most of the trucking industry. Whatever the merits of a data-based monitoring system, it is not an effective approach to monitoring all carriers.
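To see just how volatile, consider a back-of-the-envelope simulation. The numbers here are invented purely for illustration: two hypothetical carriers with the same assumed underlying risk of 0.05 recordable crashes per truck per year, one running 3 trucks and one running 300, each observed over a two-year window.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical carriers with the SAME underlying risk, observed for two years.
crashes_per_truck_year = 0.05        # assumed, for illustration only
small_fleet, large_fleet = 3, 300
years, n_simulations = 2, 10_000

small = rng.poisson(crashes_per_truck_year * small_fleet * years, n_simulations)
large = rng.poisson(crashes_per_truck_year * large_fleet * years, n_simulations)

# Crashes per truck over the two-year window, as a naive rate metric would compute it.
small_rate = small / small_fleet
large_rate = large / large_fleet

print(f"3-truck carrier:   rate ranges from {small_rate.min():.2f} to {small_rate.max():.2f}")
print(f"300-truck carrier: rate ranges from {large_rate.min():.2f} to {large_rate.max():.2f}")
print(f"Share of 3-truck simulations with zero crashes: {np.mean(small == 0):.0%}")
```

In roughly three-quarters of the runs, the 3-truck carrier looks flawless; in the rest, it looks several times riskier than the 300-truck carrier, even though both were assigned identical underlying risk. A rate built on one or two events simply cannot distinguish bad luck from bad management.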
A data-based system could be effective for the whole industry only if the number of inspections rises dramatically. In the near term, this simply will not happen. Individual states would have to conduct many more inspections than they do now, and the only way that might happen would be with a surge in funding for the Motor Carrier Safety Assistance Program (MCSAP). But funding for basic MCSAP grants has been essentially flat at around $170 million for years, and there's no sign of this figure rising. And even at static funding levels, total inspections have declined slightly in recent years due to a drop in the number of inspections reported after traffic stops.
The one potential game changer is wireless roadside inspections (WRI). Congress apparently is poised to lift its funding ban on developing a WRI program, and the Commercial Vehicle Safety Alliance recently adopted a Level VIII electronic inspection that sets the stage for wireless inspections. But we are many years away from a system that would be deployed widely throughout the country – even if groups opposed to the idea are unable to shelve it.
Another way forward
Until wireless inspections fill in the gaps, what can we do? To begin, let's discard the presumption that FMCSA must assess all carriers using a single approach. Consider the following passage from the NAS report:
There is no getting around the point that providing BASIC measures to carriers that have very infrequent inspections will result in highly variable assessments of such carriers. This is simply because not much is known about the frequency of violations for small carriers. Such high variance measures can result in mischaracterizing the nature of a carrier—the high variability could result in the carrier being given alerts more or less often than what would be warranted given its behavior. On the other hand, the industry is highly skewed, being comprised of a very large number of small carriers. If the data sufficiency standards were raised, a high percentage of the industry would be excluded from measurement by SMS and therefore monitoring by FMCSA. We believe that this issue should be further investigated.
The NAS panel succinctly summarizes perhaps the most important problem with SMS, at least for small carriers, but it then brushes off the issue with the suggestion that it's possible nothing can be done about it. The NAS panel's fundamental mindset is that if a carrier isn't measured in SMS, then FMCSA isn't monitoring it. But why does it have to be that way?
With corrective action in the areas that today skew SMS results, SMS or its replacement could be fair and effective, provided that there is sufficient data. Without sufficient data, no algorithm will be fair and effective.
Using GAO's recommended data sufficiency standard, SMS (or its successor) could monitor the 39,000 carriers that have 20 or more inspections over two years. That might be just 8% of carriers, but they represent more than 60% of all trucks on the road and more than 70% of recordable crashes.
What about the other 92% of carriers? Certainly one option would be to leave them alone at least until they have a crash or two. This idea has some merit. All the data, research and analysis conducted to date don't really tell us why some drivers crash and others don't. Is it not possible that safety ultimately lies mostly with the character and attitude of the driver – something that just can't be quantified or categorized? Or an operation might simply have very low crash risk due to geography or other factors related to its operating profile. Is focusing precious resources on that carrier worth the investment? Also, isn't not crashing the most concrete measure of safety?
An absence of crashes might be a reasonable basis for presuming that a carrier is safe, but that approach would never fly politically. And it certainly is true that drivers and carriers with bad safety practices could just be lucky – until they aren't. So a crash-based monitoring system is not acceptable.
What about adopting essentially the opposite approach? Suppose FMCSA and its state partners periodically reviewed the basic safety management practices of all carriers that weren't already being monitored through a more discriminating SMS. A coalition of several industry groups floated a similar idea recently at a meeting of FMCSA's Motor Carrier Safety Advisory Committee.
Today, a motor carrier must update its registration every two years by filing an MCS-150. Why not make that update more rigorous by requiring carriers not measured in SMS to upload documentation of their compliance programs, such as driver qualification, vehicle maintenance and hours-of-service records? This type of remote audit is the principal way FMCSA's state partners conduct new entrant safety audits now. Clearly, there would be a cost associated with this process as these records would have to be reviewed. One option would be to institute a fee for the biennial update.
And we don't have to think of roadside inspections as just fodder for an algorithm. Even if a carrier doesn't qualify for SMS monitoring based on a greater data sufficiency threshold, FMCSA can still look for a pattern of serious violations and respond accordingly. We just aren't accustomed to talking about anything other than what's wrong or right with SMS. We should think outside the algorithm.
Unfortunately, there's little chance that FMCSA will do anything other than try to revamp the SMS statistical model using the same scant data it relies on today. As this takes place, there could be another parallel to Monty Python's architect sketch. Like the would-be tenants of the "apartment building," many small carriers may discover too late that a new statistical model is essentially a slaughterhouse for their livelihoods.
The old Monty Python show bridged unrelated sketches and bits by having someone suddenly appear on screen to declare: "And now for something completely different!" Absolutely.
Avery Vise is president of compliance support firm TransComply and a longtime analyst and editor who has covered regulation and legislation in the trucking industry for nearly 20 years.