Diagnosing Type Errors

Hind­ley-Mil­ner type infer­ence is at once a great tri­umph of func­tional pro­gram­ming, and simul­ta­ne­ously a source of unend­ing pain. For 40 years, researchers have pro­posed increas­ingly sophis­ti­cated meth­ods for diag­nos­ing type infer­ence errors (although none have made it to pro­duc­tion, to our knowl­edge). A vari­ety of strate­gies have emerged:

Fault Localization

As a result of the global nature of HM type con­straints, where the type-checker notices an error is not usu­ally where the real fault lies. The strat­egy of fault local­iza­tion is to try and blame the “right” line of code for a type error.

Wand1986 devel­oped an algo­rithm to track the prove­nance of uni­fi­ca­tions made by the type-checker, which could then be pre­sented to the user. Wand pro­vides the intu­ition for need­ing this infor­ma­tion as a ques­tion. “Why did the checker think it was deal­ing with a list here?” Hav­ing access to the prove­nance of uni­fi­ca­tions, experts can track the ori­gins of unex­pected checker states—the ori­gin of an incon­sis­tent type is likely not intended by the devel­oper.

The usage of the raw proof tree pre­sented by Argus fol­lows the same intu­itional logic. Like Wand, we cur­rently rely on the intu­ition of expert pro­gram­mers to jus­tify the use­ful­ness of such prove­nance. A planned exten­sion of our work is to under­stand what user-friendly or expert-sys­tem inter­ac­tions are desired—for both expert and begin­ner pro­gram­mers.

Hage and Heeren2007 devel­oped the Top frame­work to sup­port cus­tomiz­able type-infer­ence. Tra­di­tional HM sys­tems solve con­straints as they go, bias­ing the com­piler to report errors towards the end of a pro­gram. The Top frame­work builds type graphs, allow­ing for the use of heuris­tics over the global type graph to aid error diag­no­sis. An exam­ple heuris­tic is a “trust fac­tor,” which sorts con­stran­its from least to most trusted when plac­ing blame. For exam­ple, types imported from the lan­guage stan­dard are con­sid­ered high-trust—such func­tions can only be used incor­rectly. Argus ben­e­fits from such a rule, and even stronger, Rust requires type dec­la­ra­tions on all func­tions, which can be trusted more than inferred types. For a thor­ough treat­ment of the heuris­tics used in Argus see sec­tion .

Described in , often­times a sub­tree fails because of many con­juncts. How­ever, one piece of infor­ma­tion may make the entire sub­tree suc­ceed. This same prin­ci­ple applies in fault local­iza­tion. Many recent sys­tems look for sets of con­straints, that if resolved, the pro­gram would type-check. Pavli­novic et al.2014 and Lon­caric et al.2016 use an SMT solver, and Zhang et al.2015 use a Bayesian analy­sis. Sei­del et al.2017 go fur­ther and use machine learn­ing to pre­dict blame based on a train­ing set of ill-typed pro­grams.

The Argus heuris­tics use a sim­i­lar approach by ana­lyz­ing con­junc­tions for cer­tain prop­er­ties. How­ever we do not look at sets of con­straints as final. The oblig­a­tions in a proof tree devi­ate from a con­straint-sat­is­fac­tion model because they can be ambigu­ous. Ambigu­ous oblig­a­tions, if resolved, could result in fur­ther required oblig­a­tions for the suc­cess of the sub­tree—addi­tional oblig­a­tions could be unsat­is­fi­able or unin­tended by the user. A con­crete exam­ple of this sit­u­a­tion was already demon­strated in Com­piler devel­op­ers have oper­ated for many years using a heuris­tic-based approach to diag­nos­tics. While fault local­iza­tion tech­niques can help with diag­nos­ing Rust trait errors, they are unlikely to gen­er­al­ize to all cases. Our expe­ri­ence with Argus cor­rob­o­rates this intu­ition as each heuris­tic per­forms dif­fer­ent on dif­fer­ent shapes of trait sys­tem. Fault local­iza­tion is use­full, not to deter­mine accu­rate blame, but to point users to likely points of fail­ure.

Interactive Debuggers

In con­trast to fault local­iza­tion, which attempts to find the root cause of an error, an alter­na­tive is to give the pro­gram­mer an inter­face into all the infor­ma­tion. Chi­til2001 designed a com­po­si­tional expla­na­tion graph of prin­ci­pal typ­ings. This use of “com­po­si­tional” requires that the expla­na­tion of a prob­lem have a tree struc­ture, where the type of each node must be uniqely deter­mined by the node’s chil­dren’s types. This graph is com­bined with a com­mand-line inter­face that asks users about the intended types of expres­sions in a process called algo­rith­mic debug­ging, intro­duced by Shapiro1982. The value of algo­rith­mic debug­ging is to bring the devel­oper in the loop to under­stand the men­tal under which they’re work.

reverse [] = []
reverse (x:xs) = reverse xs ++ x

last xs = head (reverse xs)
init = reverse . tail . reverse

rotateR xs = last xs : init xs

The algo­rith­mic debug­ging process for this pro­gram c
To help the tool place blame a series of ques­tions are posed to the user. Below is an exam­ple user-inter­ac­tion script from the above ill-typed pro­gram, with the user responses (y/n) appear­ing to the right.

Type error in: (last xs) : (init xs)

last :: [[a]] -> a
Is intended type an instance? (y/n) n

head :: [a] -> a
Is intended type an instance? (y/n) y

reverse :: [[a]] -> [a]
Is intended type an instance? (y/n) n

(++) :: [a] -> [a] -> [a]
Is intended type an instance? (y/n) y

Using this script the tool can iden­tify that the error lies within the func­tion reverse, but the inferred type of the append oper­a­tor ++ is not to be blamed.

Algo­rith­mic debug­ging is explained in great detail because it pro­vides a promis­ing direc­tion for­ward for Argus when the tool faces ambi­gu­ity. In sce­nar­ios of great ambi­gu­ity such as human input can help Argus pro­vide bet­ter feed­back.

Fur­ther­more, inter­ac­tive debug­ging in the con­text of typ­ing has been refined in mul­ti­ple ways. Tsushima and Asai2013 devel­oped a tool to work with the OCaml com­piler rather than requir­ing a “debug­ger-friendly” re-imple­men­ta­tion of type infer­ence. Sim­i­lar to some fault local­iza­tion approaches, Stuckey et al.2003 uses con­straint sat­is­fi­a­bil­ity and con­straint prove­nance as a debug­ging aid. Chen and Erwig2014 use a “guided debug­ging” approach with counter-fac­tual typ­ing. A com­pre­hen­sive set of type-change sug­ges­tions is fil­tered and pre­sented to the pro­gram­mer to find cor­rect code changes 3.

Our approach to type class debug­ging took inspi­ra­tion from many of the above tools. It dis­tin­guishes itself as being a richer 2D graph­i­cal inter­face that inte­grates into the exist­ing VSCode IDE. This inte­gra­tion is essen­tial to map errors to respec­tive source loca­tions, and does not require users to exit their devel­op­ment envi­ron­ment to run a com­mand-line tool.

The pub­lished ver­sion does not pro­vide an algo­rith­mic debug­ging inter­face to refine error sug­ges­tions nor does it use counter-fac­tual typ­ing to pro­pose code changes. An algo­rith­mic debug­ging inter­face would cer­tainly be a sim­ple exten­sion to the Argus sys­tem and rule out many “triv­ially false” oblig­a­tions with a few user ques­tions.

Code sug­ges­tions is an oft-requested fea­ture. Under­stand­ing the source of an error is great, but wouldn’t it be bet­ter to “mag­i­cally” pro­vide a solu­tion too? This is indeed a moti­va­tion behind pro­gram syn­the­sis and auto­mated repair.

Automated Repair

Stronger than the counter-fac­tual typ­ing pre­sented in the last sec­tion by Chen and Erwig2014, sev­eral sys­tems attempt to iden­tify a small change to the input pro­gram that causes it to be well-typed 20, 3, 28.
In this the­sis the pro­posed method of type class debug­ging aims to exhaust other avenues for debug­ging before reach­ing to pro­gram syn­the­sis—a sledge­ham­mer tool. A direc­tion for explo­ration would be to present users with a set of types that do sat­isfy a class bound. Indeed the Rust com­piler already attempts to do this under cer­tain cir­cum­stances. How­ever, Rust will not com­pose types for these sug­ges­tions, but fol­low­ing the oblig­a­tions of a failed proof tree pro­vides a recipe with which a tool could com­pose types to sat­isfy a given bound. This set of types is poten­tially empty if no instances are declared or are not vis­i­ble at the error loca­tion. This fea­ture would be more light­weight than full-blown syn­the­sis as only the types are sug­gested, not an instance of those types.

Domain-specific annotations.

evi­denced by the myr­iad of tech­niques to gen­er­ally diag­nose type errors, build­ing a fully generic diag­nos­tic sys­tem is a tricky task. An alter­na­tive approach is to give library authors the nec­es­sary tool­ing to inject domain-spe­cific knowl­edge into diag­nos­tics.

The Helium sub­set of Haskell intro­duced by Heeren et al.2003 took such an approach. Library authors can use a DSL to express cus­tom errors mes­sages that reflect com­mon errors known to library main­tain­ers.

Notably, most of the efforts in the Rust ecosys­tem towards address­ing trait errors also have the shape of domain-spe­cific anno­ta­tions. RFC #2397 rust-lang/rfcs/2397-do-not-rec­om­mend.md describes a #[do_not_recommend] anno­ta­tion that library authors could place on cer­tain trait imple­men­ta­tions. For instance, if a trait is imple­mented for tuples of length 32, then a library author could mark that imple­men­ta­tion to not appear in the sug­ges­tions of diag­nos­tics.

At time of pub­lish­ing, the Rust com­piler includes a nightly fea­ture allow­ing library authors to append addi­tional notes to diag­nos­tic mes­sages. One good exam­ple of its usage is in the Axum web frame­work. The pit­falls of cre­at­ing a cor­rect Handler have been dis­cussed at lenghts in this writ­ten work. A por­tion of RFC #2397 was to improve error mes­sages and here the fol­low­ing anno­ta­tion was included on the Han­dler trait:

#[cfg_attr(
  nightly_error_messages,
  rustc_on_unimplemented(
    note = "Consider using `#[axum::debug_handler]` to improve the error message"
  )
)]

This anno­ta­tion frame­work is not as expres­sive as Helium’s DSL so the authors have cre­ated a cus­tom macro that does the heavy lift­ing, axum::debug_handler. This macro expands into addi­tional code that does a sta­tic type asser­tion for all the Han­dler con­straints. This macro-based approach is con­vinc­ing and in many cases pro­vides the cur­rent most-spe­cific diag­nos­tic. The down­side is every library needs to cre­ate, and main­tain, these addi­tional debug­ging facil­i­ties. One attrac­tive fea­ture of the Helium sys­tem is that error notes are sta­t­i­cally-checked to ensure com­pat­i­bil­ity with the cur­rent func­tion types.