
Windows does treat path names as just sequences of uint16_t (which is how NTFS stores them) if you use W-functions and prepend the paths with "\\?\".


Oh, that's interesting. Do UNC paths not have to be valid UTF-16?


"\\?\" is strange because it looks just like a UNC path, but it isn't one. It's a way for Win32 programs to request a path in the NT Object Namespace.

What's the NT Object Namespace? You can use "WinObj" from SysInternals to see it.

The NT Object Namespace uses its own special paths called NT-Native paths. A file might be "C:\hello.txt" as a Win32 path, but as an NT-Native path, it's "\??\C:\hello.txt". "\??\" isn't a prefix, or an escape, or anything like that. It's a real directory sitting in the NT Object Namespace named "\??", and it holds symbolic links to all your drive letters. For instance, on my system, "\??\C:" is a symbolic link that points to "\Device\HarddiskVolume4".

Just like Linux has the "/dev/" directory that holds devices, the NT Object Namespace has a directory named "\Device\" that holds all the devices. You can perform File IO (open files, memory map, device IO control) on these devices, just like on Linux.

In addition to your drive letters, "\??\" also happens to have a symbolic link named "GLOBALROOT" that points back to the NT-Native path "\".

Anyway, back to "\\?\". When Win32 sees this special prefix, it parses the path differently. Many of the checks are removed, and the path is rewritten as an NT-Native path that begins with "\??\". You can even use the Win32 path "\\?\GLOBALROOT\Device\HarddiskVolume4\" (at least on my PC) as another way to get to your C:\ drive. *Windows Explorer and File Dialogs forbid this style of path.* But 7-Zip File Manager allows it! And regular programs will accept a filename in that format as a command line argument.
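
To make the rewrite concrete, here is a minimal Python sketch that models only the prefix swap described above. The names `WIN32_PREFIX`, `NT_PREFIX` and `to_nt_native` are made up for illustration, and the real Win32 layer likely does more than this, so treat it as a toy rather than a description of the actual implementation.

```python
# Hypothetical helper names; this models only the prefix swap described above.
WIN32_PREFIX = "\\\\?\\"  # the four characters \\?\
NT_PREFIX = "\\??\\"      # the four characters \??\


def to_nt_native(win32_path: str) -> str:
    """Swap the Win32 prefix for the NT object directory, leaving the rest as-is."""
    if not win32_path.startswith(WIN32_PREFIX):
        raise ValueError("path does not start with the expected prefix")
    # No normalization is done here: the remainder is passed through untouched.
    return NT_PREFIX + win32_path[len(WIN32_PREFIX):]


print(to_nt_native("\\\\?\\C:\\hello.txt"))
```

Note that the remainder of the path is copied verbatim, which matches the point about many of the usual checks being skipped.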

Another noteworthy path in "\??\" is "\??\UNC\". It's a symbolic link to "\Device\Mup". From there, you can add on the hostname/IP address, and share name, and access a network share. So in addition to the classic UNC path "\\hostname\sharename", you can also access the share with "\\?\UNC\hostname\sharename" or "\\?\GLOBALROOT\Device\Mup\hostname\sharename".
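
The symbolic links mentioned so far can be modeled with a toy resolver. This is only a sketch: the table holds just the three links named in this comment, the "HarddiskVolume4" target is specific to my machine, the `resolve` function is hypothetical, and matching here is case-sensitive for simplicity.

```python
# Toy symbolic-link table built from the links named in this comment.
# The HarddiskVolume4 target is machine-specific; yours may differ.
SYMLINKS = {
    "\\??\\C:": "\\Device\\HarddiskVolume4",
    "\\??\\GLOBALROOT": "",  # points back at the namespace root "\"
    "\\??\\UNC": "\\Device\\Mup",
}


def resolve(nt_path: str, max_hops: int = 8) -> str:
    """Follow symbolic links at the front of an NT-Native path (toy model)."""
    for _ in range(max_hops):
        for link, target in SYMLINKS.items():
            # Match the link itself or the link followed by a path separator.
            if nt_path == link or nt_path.startswith(link + "\\"):
                nt_path = (target + nt_path[len(link):]) or "\\"
                break
        else:
            return nt_path  # no link matched, so the path is fully resolved
    raise RuntimeError("too many symbolic links")


print(resolve("\\??\\UNC\\hostname\\sharename"))
print(resolve("\\??\\GLOBALROOT\\Device\\Mup\\hostname\\sharename"))
```

In this model both spellings end up at the same "\Device\Mup\hostname\sharename" path, which is why they reach the same share.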


I don't believe they do. Maybe the documentation will tell you they must be, but in practice file names with broken UTF-16 can be created.
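
For anyone wondering what "broken UTF-16" looks like, here is a small Python sketch. It builds a sequence of 16-bit units containing an unpaired high surrogate and shows that a strict UTF-16 decoder rejects it. The helper name `is_well_formed_utf16` is made up, and this only inspects bytes in memory; it does not create any file.

```python
# Three 16-bit code units: 'A', an unpaired high surrogate, 'B'.
units = [0x0041, 0xD800, 0x0042]
raw = b"".join(u.to_bytes(2, "little") for u in units)


def is_well_formed_utf16(data: bytes) -> bool:
    """Return True if the bytes decode as strict little-endian UTF-16."""
    try:
        data.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


print(is_well_formed_utf16(raw))

# The "surrogatepass" error handler keeps the lone surrogate instead of failing,
# which is one way to carry such a name around as a Python str.
name = raw.decode("utf-16-le", "surrogatepass")
print(len(name))
```

An API that treats names as plain uint16_t sequences has no reason to reject `raw`, while anything that insists on valid UTF-16 will.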


It's the same on Unix.

On Unix the reason for this is that the kernel has no idea what codeset you're using for your strings in user-land, so filesystem-related system calls have to limit themselves to treating just a few ASCII codepoints as such (mainly NUL, `/`, and `.`).
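
A quick Python sketch of what that means in user-land, assuming the program wants to treat names as UTF-8 text. The file name below is made-up example data, and `to_text` / `to_bytes` are hypothetical helpers. The "surrogateescape" error handler lets undecodable bytes survive a round trip.

```python
# Latin-1 encoded "café.txt": a perfectly legal Unix file name, but not valid UTF-8.
raw_name = b"caf\xe9.txt"


def to_text(name: bytes) -> str:
    """Decode a file name as UTF-8, smuggling bad bytes through as lone surrogates."""
    return name.decode("utf-8", "surrogateescape")


def to_bytes(name: str) -> bytes:
    """Reverse of to_text: turn the smuggled surrogates back into the original bytes."""
    return name.encode("utf-8", "surrogateescape")


text = to_text(raw_name)
print(ascii(text))
print(to_bytes(text) == raw_name)
```

On Unix, Python's own `os.fsdecode` and `os.fsencode` use the same error handler with the filesystem encoding, for exactly this reason.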



