Detecting Intel VT-x from TLB Miss Latency

2022-02-24 | Tags: #research #virtualization

I recently read a paper called "Detecting hardware-assisted virtualization with inconspicuous features", which proposes a prime-and-probe technique that measures TLB miss latency to detect hardware virtualization (HV). Some points in the paper's algorithm were confusing to me, so I contacted the authors to ask whether they could share their code, but I did not get a response. In the end I changed the algorithm and built a simple flush-and-reload based technique to measure TLB miss latency. The main idea relies on extended page tables (EPT), the hardware support for memory virtualization in Intel VT-x.

The concept behind EPT is to add a second translation layer for virtual machines: guest-physical addresses are translated to host-physical addresses through a separate set of page tables. As a result, a TLB miss inside a guest triggers an additional page walk, and hardware virtualization can be detected from that extra page-walk latency.
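To get a rough feel for why the nested walk is slower, the worst-case number of memory references can be counted. The sketch below assumes 4-level guest page tables and a 4-level EPT and ignores paging-structure caches, so it gives an upper bound rather than what is measured in practice. The function names are introduced here for illustration only.

```cpp
#include <cstdio>

// Worst-case memory references for a native page walk:
// one page-table entry read per level.
constexpr int native_walk_refs(int levels) { return levels; }

// Worst-case memory references for a two-dimensional (guest + EPT) walk.
// Every guest page-table read uses a guest-physical address that must itself
// be translated through the EPT, and so must the final guest-physical
// address: guest * (ept + 1) + ept = (guest + 1) * (ept + 1) - 1.
constexpr int nested_walk_refs(int guest_levels, int ept_levels) {
    return (guest_levels + 1) * (ept_levels + 1) - 1;
}

inline void print_walk_costs() {
    std::printf("native: %d refs, nested: %d refs\n",
                native_walk_refs(4), nested_walk_refs(4, 4));
}
```

Under these assumptions a nested walk needs up to 24 references against 4 for a native one, which is the gap the timing measurement tries to expose.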

The approach can be summarized as:

  • Create a target page
  • Fill TLB with random pages
  • Flush(target_page[0])
  • let tlb_cache_miss = Measure Time(target_page[0])
  • Flush(target_page[0])
  • let cache_miss = Measure Time(target_page[0])
  • return tlb_cache_miss - cache_miss

The source code is available at: https://github.com/m8/detect_vtx_fr/blob/master/main.cpp

Creating fake pages

First we need to create a page struct.

struct  __attribute__ ((aligned (PAGE_SIZE))) Page { char x [PAGE_SIZE] = {0}; };

Next, we create a target page whose access latency we will measure. This page is accessed in the detection part.

TARGET_PAGE = new Page();

We also need a set of pages to fill the TLB. Accessing them evicts the target page's translation from the TLB.

for (size_t i = 0; i < TLB_SET_SIZE; i++)
{
    TLB_SET.push_back(new Page());
}
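The snippets above use PAGE_SIZE, TLB_SET_SIZE, TARGET_PAGE and TLB_SET without showing their definitions. A minimal sketch of how they might be declared follows. The 4 KiB page size and especially the TLB_SET_SIZE value are assumptions for illustration, not values taken from the repository, and smart pointers replace the raw new only to avoid leaks. Note that new only honors the page alignment of an over-aligned type starting with C++17.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Assumed value: 4 KiB pages (skipped if the platform already defines it).
#ifndef PAGE_SIZE
#define PAGE_SIZE 4096
#endif

// Hypothetical eviction-set size; it should exceed the number of TLB
// entries on the machine under test.
constexpr std::size_t TLB_SET_SIZE = 4096;

struct alignas(PAGE_SIZE) Page { char x[PAGE_SIZE] = {0}; };
static_assert(sizeof(Page) == PAGE_SIZE, "Page must cover exactly one page");

std::unique_ptr<Page> TARGET_PAGE;
std::vector<std::unique_ptr<Page>> TLB_SET;

// Allocate the target page and the pages used to evict it from the TLB.
// Zero-initializing x writes to every page, so each one is backed by memory.
void create_pages(std::size_t count = TLB_SET_SIZE) {
    TARGET_PAGE = std::make_unique<Page>();
    TLB_SET.clear();
    TLB_SET.reserve(count);
    for (std::size_t i = 0; i < count; i++) {
        TLB_SET.push_back(std::make_unique<Page>());
    }
}
```

Giving each entry its own page matters because a TLB entry maps a whole page: touching any byte of a page is enough to load its translation.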

Detection

The first part of the detection algorithm flushes the target cache line and evicts the target page's TLB entry. The clflush instruction flushes a cache line from all cache levels, and the flush function simply wraps it. To evict the TLB entry, we access every page in TLB_SET.

flush(&TARGET_PAGE->x[0]);

for (size_t i = 0; i < TLB_SET.size(); i++)
{
    access_memory(&TLB_SET[i]->x[128]);   
}
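The flush and access_memory helpers are not shown in this post. One possible minimal version is sketched below, assuming an x86-64 target and the GCC/Clang intrinsics header; it may differ from the code in the repository.

```cpp
#include <x86intrin.h>  // _mm_clflush, _mm_mfence (GCC/Clang, x86 only)

// Evict the cache line containing addr from every cache level.
inline void flush(const void* addr) {
    _mm_clflush(addr);
    _mm_mfence();  // wait for the flush to complete before continuing
}

// Read one byte; the volatile access keeps the compiler from dropping the load.
inline char access_memory(const void* addr) {
    return *static_cast<const volatile char*>(addr);
}
```

Note that clflush only removes the data from the caches. It does not touch the TLB, which is why the loop over TLB_SET is needed as a separate step.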

Next, we access the target cache line. This access incurs both a TLB miss and a cache miss.

long cache_tlb_miss = measure_latency(&TARGET_PAGE->x[0]);
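measure_latency is also not shown in this post. A minimal sketch, assuming the time stamp counter is read with rdtscp through the GCC/Clang intrinsics, could look like the following; the exact fencing is a design choice made for this sketch and may differ from the repository.

```cpp
#include <cstdint>
#include <x86intrin.h>  // __rdtscp, _mm_lfence, _mm_mfence (GCC/Clang, x86 only)

// Return the number of TSC ticks taken by a single one-byte load from addr.
inline long measure_latency(const void* addr) {
    unsigned int aux;  // receives IA32_TSC_AUX; not used here
    _mm_mfence();      // let earlier memory operations finish first
    std::uint64_t start = __rdtscp(&aux);
    _mm_lfence();      // keep the load from starting before the first timestamp
    (void)*static_cast<const volatile char*>(addr);
    std::uint64_t end = __rdtscp(&aux);  // rdtscp waits for the load to complete
    _mm_lfence();
    return static_cast<long>(end - start);
}
```

The result includes the fixed overhead of the timing instructions themselves. That overhead is present in both measurements, so it cancels out when the two values are subtracted.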

Then we flush the line again.

flush(&TARGET_PAGE->x[0]);

And we measure the access time again. This time the access incurs only a cache miss, since we know the page's translation is now in the TLB. Subtracting the two values gives the TLB miss latency.

long cache_miss = measure_latency(&TARGET_PAGE->x[0]);

return cache_tlb_miss-cache_miss;
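A single subtraction is noisy, so the measurement is worth repeating many times and summarizing. The helpers below are a hypothetical sketch of that step: collect_samples and median are names introduced here, and measure_once stands for whatever function performs one flush-and-reload round as described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Call measure_once() n times and keep every result.
template <typename F>
std::vector<long> collect_samples(F&& measure_once, std::size_t n) {
    std::vector<long> samples;
    samples.reserve(n);
    for (std::size_t i = 0; i < n; i++) {
        samples.push_back(measure_once());
    }
    return samples;
}

// Median (upper middle element for even sizes); returns 0 for an empty input.
// The median is less sensitive than the mean to outliers such as interrupts.
inline long median(std::vector<long> samples) {
    if (samples.empty()) return 0;
    auto mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}
```

For example, median(collect_samples(detect, 10000)) would give one value per run, where detect is a hypothetical name for the function ending in the return statement above and the sample count is arbitrary.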

Results

[Figure: x-axis: clock cycles, y-axis: samples]


This was the end of the blog post. You can reach me via email umusasadik at gmail com