How to Read and Write Metadata in LLVM

29 May 2017

This is a short post demonstrating working code for writing and reading metadata in LLVM IR. I couldn’t find a good one-stop source on how to write metadata to LLVM IR, hence this post. The metadata API was introduced to LLVM IR in LLVM-2.7 (refer to this post). Metadata can be used to attach any kind of data to code without changing what the code does. It can be anything from debug information to statistical data to a useless rant about how your day job sucks. For more information, refer to the blog post linked above. Now let’s see some code. All of these examples were written against LLVM-3.8.

Inserting Metadata

Let’s write a function pass that attaches the total number of instructions to the function’s metadata as an integer, and attaches each instruction’s position within the function to that instruction’s metadata as a string (just to demonstrate how to insert a string as metadata).

    virtual bool runOnFunction(Function &F) {
      errs() << "I saw a function called " << F.getName() << "!\n";
      // Declarations have no body, so there is nothing to annotate.
      if (F.isDeclaration())
        return false;
      LLVMContext& C = F.getContext();
      int instructions = 0;
      for (auto I = inst_begin(F), E = inst_end(F); I != E; ++I) {
        instructions++;
        // Attach the instruction's 1-based position as a string node.
        MDNode* N = MDNode::get(C, MDString::get(C, std::to_string(instructions)));
        I->setMetadata("stats.instNumber", N);
      }
      // Wrap the 64-bit count in a node, then nest that node in an outer one;
      // the reading code has to unwrap both levels.
      MDNode* temp_N = MDNode::get(C, ConstantAsMetadata::get(ConstantInt::get(C, llvm::APInt(64, instructions, false))));
      MDNode* N = MDNode::get(C, temp_N);
      F.setMetadata("stats.totalInsts", N);
      return true; // the IR was modified
    }

I think the code is pretty self-explanatory. The important parts are the setMetadata calls on the instruction and on the function. Each takes a string naming the kind of metadata and the metadata node that holds the information. Here is the whole code.
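
To get a feel for what the pass produces, here is a rough sketch of how the attachments might look in textual IR. The function @add and its body are hypothetical, made up only for illustration, and the exact node numbering (!0, !1, ...) in real output may differ.

```llvm
; Hypothetical two-instruction function after running the pass.
; The function-level attachment appears before the opening brace.
define i32 @add(i32 %a, i32 %b) !stats.totalInsts !0 {
  %sum = add i32 %a, %b, !stats.instNumber !2
  ret i32 %sum, !stats.instNumber !3
}

!0 = !{!1}      ; outer node (N) wrapping the inner node
!1 = !{i64 2}   ; inner node (temp_N) holding the instruction count
!2 = !{!"1"}    ; position of the first instruction, as a string
!3 = !{!"2"}    ; position of the second instruction, as a string
```

Note the two levels of nesting for stats.totalInsts: the count sits in an inner node that is itself the only operand of the attached node.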

Reading Metadata

Reading metadata is very straightforward. Use getAllMetadata to grab all metadata nodes, or use getMetadata(StringRef Kind) to access a specific metadata node. Here is the full code.

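    virtual bool runOnFunction(Function &F) {
      LLVMContext& C = F.getContext();
      // getAllMetadata yields (kind ID, node) pairs, so look up the ID of our kind.
      unsigned totalKind = C.getMDKindID("stats.totalInsts");
      SmallVector<std::pair<unsigned, MDNode *>, 4> MDs;
      F.getAllMetadata(MDs);
      for (auto &MD : MDs) {
        // Skip attachments of other kinds, such as debug info.
        if (MD.first != totalKind)
          continue;
        // Unwrap the outer node, then the inner node holding the constant.
        MDNode* Inner = cast<MDNode>(MD.second->getOperand(0));
        Constant* val = cast<ConstantAsMetadata>(Inner->getOperand(0))->getValue();
        errs() << "Total instructions in function " << F.getName() << " - " << cast<ConstantInt>(val)->getSExtValue() << "\n";
      }
      for (auto I = inst_begin(F), E = inst_end(F); I != E; ++I) {
        // Showing a different way of accessing metadata: look it up by kind name.
        if (MDNode* N = I->getMetadata("stats.instNumber")) {
          errs() << cast<MDString>(N->getOperand(0))->getString() << "\n";
        }
      }
      return false; // read-only pass, the IR is unchanged
    }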


Software Engineer at AWS using simulation to train robots. I am obsessed about history.