After years of working exclusively on Windows, I took a job that required me to build file management features, except now on macOS and Linux (along with Windows).
All I can say is, this article is the tip of the iceberg on Windows I/O weirdness. You really don't realize how strange it is until you are actively comparing it to an equivalent implementation on the two other competing operating systems day-to-day.
Each of them has their quirks, but I think Windows takes the cake for "out there" hacks you can do to get things running. I sometimes like to ponder what the business case behind all of them was, then I Google it and find the real reasons are wilder than most ideas I can imagine.
>Windows I/O weirdness. You really don't realize how strange it is until you
"can I get a wut... WUT?" - Developers, Developers, Developers
It's just hard for people today to understand what a revolution stdin and stdout were: fully 8-bit, but sticking with ASCII as much as possible. There was nothing about them that limited Unices from having whatever performant I/O underneath, but they gave programmers at the terminal the ability to get a lot done right from the command line.
The web itself is an extension of stdin and stdout, and the subsequent encrusting of the simple HTTP/HTML/et al. standards with layer upon layer of goop that calls for binary blobs can be seen as the invasion of Cutlerian-like mentalities. It's sad that Linux so successfully took over that all the people who we were happy to let use IIS and ASP and ActiveX had to come over to this side with their ideas. Neither set of ideas is bad, but together they are incoherent.
"No idea of which is bad, but which together are incoherent."
Right, the filename/path length issue I complained about in my earlier post often occurs when a web page is saved by a browser (some web pages have outrageously long filenames).
Incidentally, many years ago I did a tour of Microsoft's operation in Seattle around the time Microsoft introduced subdirectories into MSDOS, and the tour guide (can't recall his name, but he was responsible for the development of MS's Flight Simulator) gave a considerable spiel about why Microsoft decided to run with the backslash instead of the forward slash as per Unix. Even then, I thought 'oh no, here comes confusion', and others with me thought the same. When we challenged him about it, he said we (Microsoft) want to clearly differentiate ourselves from Unix (there was an arrogance about his answer that I well remember).
Flight Simulator was developed outside Microsoft, but within Microsoft you could only be referring to Alan Boyd. I received a similar tour.
I'm sure you heard what you thought you heard, arrogance and all, but IIRC it was backslash because IBM insisted the slash be the switch character. Anybody remember $SWITCHAR?
I know Flight Simulator was originally developed outside MS and I think my tour wasn't that long after MS acquired it.
It's too long ago for me to associate the name 'Alan Boyd' with the person in question but I do remember that he had a loud, penetrating self-assured voice. (Incidentally, he spent considerable time demonstrating Flight Simulator's new features).
You're right, IBM was a large part of the discussion as back then it was the principal client for MSDOS. However, I came away from the visit with the understanding that MS was in full agreement with IBM's decision despite MS's dabblings with Unix.
I had a particular interest at the time as I had a S-100 Godbout CompuPro computer, and in addition to CP/M, I ended up putting Seattle Computer Products' DOS (SB86 from Lifeboat Associates) on it which meant that I had compatibility with MSDOS.
I can understand why MS would have wanted to differentiate MSDOS and the backslash being one way, what I'm still not clear about is why IBM would have wanted to make such a distinction.
Re $SWITCHAR, very vaguely, but from my point of view it added confusion. Like some other commands, its implementation and architecture appeared to be the result of afterthought rather than good design. I've forgotten much of that stuff.
I rather liked VMS conventions as it was immediately obvious what was the directory part of the path and what was the filename and the device. You could create virtual devices as well, so for example, with VMS TeX, I had TEX_ROOT defined to point to the root of the TeX distribution and you would have TEX_ROOT:[INPUTS] for the input directory, TEX_ROOT:[MF] for Metafont files, TEX_ROOT:[EXE] for executables, etc. and everything was logically arranged. CLD files were another wonderful thing where you could define your CLI interface in an external file and let the OS handle argument and option parsing for you.
On 2 & 3, what if you have a directory? There’s no easy way to tell whether /foo/bar is a directory or a file. FOO:[BAR] must be a directory.
I have a vague notion that there might have been the ability to have a virtual device span multiple directories, but as I think about it, this seems unlikely since it would make it ambiguous where something would be created if a directory exists in both root1 and root2, so perhaps not. It’s been 23 years since I last used VMS and 30 since it was my daily driver though, so it’s hard for me to say too much.
The downside of building in stuff is that you can't easily replace it. I don't have any VMS experience, but the versioned file systems I've studied are very far from being a replacement for what I would consider version control today. They're more like the automatic writing of backup files in Emacs.
Distributed FS sounds like something with a huge design space. It's better to do that in userspace. Was Plan 9 really the first time virtual file systems could be implemented in userspace? It seems like such an obviously useful idea in retrospect.
I'm not familiar with the inner workings, but simply moving files feels odd in Windows compared to macOS. It's most obvious when it's a big lift in terms of data size or file/folder counts, but it feels like Windows literally copies the files into memory, then rewrites them on disk, or something similar that results in poor performance and a long-running cut/copy/paste dialog box. I've had some of these run for hours on decent hardware (SSD, etc.) for what I consider small datasets (a couple GB). It's been a major Windows gripe of mine for years now.
Meanwhile macOS appears to just change an internal link to the data that’s already written on disk. As such, it’s usually so very fast compared to Windows.
Windows File Explorer does a lot of extra work to get a sense of file sizes and other metadata to try to keep the UI looking fresh/interesting/useful to someone watching the job in real time.
If you need to seriously move/copy lots of files or lots of data in Windows it is generally a good idea to use shell commands. Robocopy [1], especially, is one of the strongest tools you can learn on Windows. (It gets very close to being Windows' native rsync.)
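For example, a minimal sketch of a mirror job (the paths are placeholders; note that /MIR deletes files in the destination that are gone from the source, so double-check the target first):

    # Mirror a tree: 16 copy threads, 2 retries, 5s between retries, with a log
    robocopy C:\Data D:\Backup\Data /MIR /MT:16 /R:2 /W:5 /LOG:C:\Temp\robocopy.log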
Windows does literally copy (parts of) files into memory. More precisely it's Windows Defender Real-Time Protection. It's a real menace when you're dealing with a lot of small files, e.g. node_modules.
Windows Explorer is also slow for an unknown reason.
Doing file operations through the API with Real-Time Protection turned off is several orders of magnitude faster in the case of small files. It's crazy stuff.
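If you hit this, one workaround is to exclude the hot directory rather than disabling protection wholesale. A sketch (needs an elevated PowerShell; the path is just an example):

    # Exclude a build tree from Defender real-time scanning
    Add-MpPreference -ExclusionPath 'C:\dev\myapp\node_modules'
    # Confirm the exclusion is in place
    (Get-MpPreference).ExclusionPath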
A lot of this depends on whether you're crossing devices. If you think of drive letters as mount points it may make more sense - if you're moving between mountable filesystems obviously a move has to be a copy-then-delete; if you're remaining on the same filesystem a move can typically be a rewriting of indexing information only with very limited data rewriting.
One other thing that can be an issue particularly on NTFS with ACLs is that moving files typically retains their ownership and permissions, while copying files typically inherits the ownership and permissions of the destination. This can bite you if as an administrator you're moving data from one user's account to another because a move will leave the original owner still owning the files.
Eh, with moving files on Windows in the same security context it is 'generally' pretty fast on the same drive.... Are you sure you didn't paste in a directory that is setting new security permissions on all files?
> All I can say is, this article is the tip of the iceberg on Windows I/O weirdness. You really don't realize how strange it is until
In one way it is beautiful. "A card laid is a card played", as they say. Don't mess with the user for your conception of Agile Clean Extreme Code (tm). Each stupid design decision is forever.
Windows .bat files win the awfulness and quirkiness contest against sh by a razor-thin margin. And both are awesome.
PowerShell is an absolutely amazing scripting language; it gets things done way quicker than Bash because it's object-oriented, and you don't have to call external tools to get anything done (sed, grep, find, touch, curl, etc.). It can even run raw C# code for you.
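For instance, a small sketch of what the object pipeline buys you - each stage passes typed objects with properties, so there's no awk/cut-style text surgery:

    # Five largest files under the current directory, no text parsing involved
    Get-ChildItem -Recurse -File |
        Sort-Object Length -Descending |
        Select-Object Name, Length, LastWriteTime -First 5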
This definitely falls into the category for me of "things that I wasn't there for."
Because I learned computers when DOS was a thing, I will always be able to write a .bat or use CMD when necessary, but having been on the UNIX/Linux side since 2003, I didn't learn C# or PowerShell but rather bash, php, ruby. So while I'm friendly with modern Windows now that they closed up the "is it a stable OS" gap with Apple, I don't really know what to do in PowerShell and am more likely to use WSL!
PowerShell is the closest thing to the Xerox PARC REPL experience that ships in the box on modern platforms.
Not only is it a proper programming language, it is integrated with .NET and COM/DLLs as well, so you can script not just the OS: any application automation library is exposed too.
Nowadays, it is possible to automate anything on Windows via PowerShell, the same OS APIs exposed by GUIs are also accessible to PowerShell.
On the UNIX side there are things like the Fish shell that offer some of these capabilities, but they aren't as widely adopted as PowerShell is on Windows.
Sorry, but Powershell is not a programming language. It has way too many quirks and gotchas to qualify as a programming language. It is an interactive scripting language first, and a scripted language second. But a programming language it is not.
To give just one example: its automatic boxing and unboxing of arrays disqualifies it as a programming language. Try to return a one-element array from a Powershell function and you'll see what I mean.
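For anyone who hasn't been bitten yet, here's roughly what that looks like (a sketch with hypothetical function names):

    function Get-Items { return @(42) }         # returns a one-element array...
    (Get-Items).GetType().Name                  # Int32 - it got unrolled to a scalar
    function Get-ItemsSafe { return ,@(42) }    # the unary comma re-wraps it
    (Get-ItemsSafe).GetType().Name              # Object[] - the array survives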
It's worth learning at any age, especially now that it is an open-source, cross-platform shell. The PS Koans [1] that recently showed up on HN seemed like an interesting way to try to learn it.
I just want a shell that runs my commands, I don't want yet another language.
The _beauty_ of bash is you can learn the basics of the language very easily, call out to external tools, and _take that knowledge with you_. Those tools exist independently.
I've tried to switch to PowerShell a few different times and I always find it to occupy this no man's land between a quality shell and a quality scripting language. As a shell I find it inferior to Bash, and as a scripting language I find it inferior to Python.
Not that it’s a particularly compelling feature on Linux with the standard offering, but it’s a good option for cross platform scripts at times, particularly running in docker.
I mean, if I want something that can run on as many platforms as possible without a prior installation, I stick as closely as I can to posix sh. If I want something more flexible that can run consistently, but may require an installation beforehand, I use python. I don’t really see what niche PowerShell would fill for me.
It doesn't have to fill a niche for you. Before cross-platform PowerShell I certainly used Python for some of those kinds of scripts.
I think a lot of it gets down to ergonomics/aesthetics to decide if you find a useful niche for PowerShell for yourself. Python's os module is powerful and lets you run/chain almost any native commands and shell operations you want to spawn, but it is still a very different API and abstraction with different ergonomics and aesthetics than shell-style pipes and redirects.
PowerShell gives you that focus on shell-like pipes/redirects, but then gives you some Python-like power on top of that to also work with the outputs of some commands as objects in a scripting environment.

There's a lot of interesting value in comparing/contrasting PowerShell and Python, and if you are happy with Python maybe there isn't a big reason to learn PowerShell. PowerShell is there for when you are doing a lot of shell-like processing pipelines and want to write them as such, but want some of the power of a language like Python behind it.

It's a lot more powerful than POSIX sh, and it is similarly but differently powerful to Python, but it starts from a REPL that looks/acts more like POSIX sh. I don't know if you have a need for that niche yourself, but I find it useful for that.
`$ErrorActionPreference = "Stop"` is very similar and does that for all PowerShell cmdlets. You still have to check $LastExitCode manually for BATs/EXEs, though.
`$PSNativeCommandUseErrorActionPreference = $true` is an experimental flag [1] as of PowerShell 7.3 that applies the same $ErrorActionPreference to BAT/EXEs (native commands), stopping (if $ErrorActionPreference is "Stop") on any write to stderr or any non-zero return value.
Having to handle exe's and bats separately is _exactly_ the problem with $ErrorActionPreference, and why it's not suitable.
I wasn't aware of $PSNativeCommandUseErrorActionPreference though, seems like it's very new. How does that work with the helpful Windows tools that decide not to return 0 on success (hello, robocopy)?
The answer to that, including robocopy as the direct example used, is at the bottom of that documentation I linked on $PSNativeCommandUseErrorActionPreference: you set it to $false before calling something like robocopy and then reset it when done.
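So the pattern ends up looking roughly like this (a sketch; robocopy exit codes 0-7 are success/informational, 8 and up are failures):

    $ErrorActionPreference = 'Stop'
    $PSNativeCommandUseErrorActionPreference = $true

    # robocopy returns non-zero even on success, so opt it out and check manually
    $PSNativeCommandUseErrorActionPreference = $false
    robocopy C:\src C:\dst /MIR
    if ($LASTEXITCODE -ge 8) { throw "robocopy failed with exit code $LASTEXITCODE" }
    $PSNativeCommandUseErrorActionPreference = $true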
5.1 was the last "Windows-specific"/"Windows-only" PowerShell (and is still branded "Windows PowerShell" more than "PowerShell") before it went full cross-platform (and open source). It's an easy install for PowerShell 7+ and absolutely worth installing. If you are using tools like the modern Windows Terminal and VS Code they automatically pick up PowerShell 7+ installations (and switch to them as default over the bundled "Windows PowerShell"), so the above command line really is the one and only step.
You can also install the latest PowerShell Core (the open-source, cross-platform releases we're talking about) via Scoop, which is a package manager for Windows that works even if you don't have admin rights: https://scoop.sh/#/apps?q=pwsh&s=0&d=1&o=true
Unless I can rely on it being somewhat available, it's not really feasible to use. It's a bit like writing scripts in fish because it's easily installable - nobody is going to use it.
Winget isn't bundled with Windows 10 either (but I think it is with 11), and it's not on Windows Server.
If I need to install a package manager _and_ a shell, I might as well just install WSL and be done with it.
Winget is auto-installed on Windows 10 by Windows Update and/or Store updates for every copy of Windows 10 with a recent enough build, and has been for more than a year or two, so long as the machine doesn't have the Store disabled or blocked. It is bundled inside the "Application Installer Platform", a low-level Store package that powers a lot of little things like the "double-click to install an MSIX file" experience, and which Windows generally keeps up to date quickly if Store updates aren't blocked.
I can't speak to your usage of Windows Server, but provisioning winget and PowerShell 7+ are standard bootstrapping steps in VM images at places I work, because those are generally assumed to be basic equipment at this point.
It also adds its own special brand of crap... as in, after trying 10 different ways (not kidding: https://social.technet.microsoft.com/wiki/contents/articles/...) of executing an external ffmpeg command over several hours, I eventually wrote a one-line .bat file* and was done with it. Never again.
*
for %%a in ("*.mp4") do ffmpeg -i "%%a" -vcodec libx265 -crf 26 -tune animation "%%~na.mkv"
The maximum path length of the NT kernel is 32,767-ish UCS-2 characters. 260 is a limit imposed by the legacy Win32 interfaces IIRC. I believe the W-interfaces get you the full-fat version, it's just that they're so inconsistently used as to all but guarantee that something you need will have problems.
The user-mode stuff is kind of a mess. The kernel-mode stuff is comparatively orthogonal.
> I think the margin of bat vs sh is a larger margin [...] the entire file is reread for each line
Yeah, agreed. Separation of data and code always was a mistake.
I wonder if this feature of bat files was once considered a "best practice"? Practically, you should only append lines, I guess. When I close my eyes, I can see a DOS batch file doing actual batch job processing, appending to itself and becoming an intertwined log and execution script.
"All I can say is, this article is the tip of the ice berg on Windows I/O weirdness."
Well, then, is there a more detailed summary than this one that's accessible?
This one looks very useful and I'll use it, but to make the point about more info it'd be nice to know how they differ across Windows versions.
For example, I've never been sure about path length of 260 and file length of 255. I seem to recall these were a little different in earlier versions, f/l being 254 for instance. Can anyone clear that up for me?
Incidentally, I hit the 255/260 limit regularly; it's a damn nuisance when copying stops because the path is, say, 296 or 320, or more.
"no amount of prefixing will help you, if random app enforces the limit"
I've noticed that; it's partially the reason for my confusion (I didn't wake up to it for quite a while as I put it down to the different versions of Windows I was running on various machines). Other pains are caused by apps that still don't support Unicode and crash or stop copying when they encounter a non-ASCII character.
Thanks for the reference, I wasn't aware of those changes in Win 10 (I run mainly Linux and have been weaning myself off Win for some years).
"...(this value is commonly 255 characters)."
I think the word 'commonly' in those notes confirms my point in that it's changed slightly over the years. Also, I recall the internal processing was once 16k and not 32k; come to think of it, this may have been with the previous version of NTFS (can't remember which version of Win that was). My interest is now piqued so I'll search it out.
That we're even discussing such matters confirms the thrust of the article.
Explorer can resolve some http urls to network paths, e.g. for SharePoint libraries.
Programs, e.g. python scripts, can then use those paths, but only after explorer has resolved them. Before that, the path will be treated as not existing.
From my experience as an EE, working with serial ports is much nicer on Windows (COM1, COM2, etc.) than on Linux, where serial ports are abstracted as files like /dev/ttyACM0 and have a lot more gotchas.
PowerShell is also quite a powerful alternative to Bash/Mingw, although it came out much later.
Windows might do some things differently than UNIX-like OSs, but it does them really well.
Technically, COM1, COM2 etc. are filenames as well. They are just special in that they are available everywhere. That's why you are not allowed to create any file named COM1 or such.
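And because those names are resolved by the Win32 layer before the filesystem ever sees them, opening 'COM1' works from any directory. A minimal sketch of talking to a port from PowerShell via .NET (assumes a device on COM3):

    $port = New-Object System.IO.Ports.SerialPort 'COM3', 115200, 'None', 8, 'One'
    $port.Open()
    $port.WriteLine('hello')   # send a line to the device
    $port.Close()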
Wow! There's an operating system I ain't heard tell of in a good long while. That's the very first OS I used in a professional context. Got me my first computer store job (in my late "teens") on a CP/M system.
I deal with software that processes files on a Windows system... loves to break when people on other OSes submit AUX, PRN, COM, File:Name, and tons of other unacceptable names (like 'file ').
I'm glad our new releases work on Linux and we don't have to deal with that crap in 99.99% of cases now.
I've done quite a bit of work with serial ports on Windows, Linux and other unixes. I've also written a serial device driver.
Your comment is very confusing to me. The serial ports are abstracted to a file on Windows just like on unixes - the file is actually discussed in the above article: \\.\COM1
Maybe you're talking about the old days where you would just outb 0x3f8? The modern interfaces are actually fairly similar.
Remember typing in entire programs from magazines and computer manuals and saving them to cassette tape or floppy disc? That was "the good ol' days" for sure… :)
There is also the persistent problem of USB serial adapters being assigned incremental numbers until they're in double digits that many tools don't let you select from their GUI. So you have to go in and manually purge those devices to get back to sane numbering.
I just started using serial ports on Windows while doing some Raspberry Pi Pico hobby projects. Something that I find strange is that every new device gets assigned a new COM port. I mean, if I do this for a while, one day I will have COM port 100, 200 and so on. Is that right, or does it somehow reset the COM ports?
That's how it works, and generally it's to the user's advantage. We often set specific parameters based on the device's serial number, so getting the same COM port is nice; sometimes the devices are so simple that you cannot query their serial number.
Sometimes I'll do a "blank slate" and delete all my accumulated COM ports in Device Manager (need to enable "Show Hidden Devices").
COM ports on Windows are crap nowadays due to how crappy USB to serial adapters have become. I've seen Windows reassigning different COM names to the same device every single time it was unplugged due to it "remembering" what COM port was used previously. Needless to say, that was an anti-feature if there ever was one.
Windows tries to keep a long-term identity for all of the device instances that it knows about (and, in the ideal world, assign the same COM port number to the same physical serial adapter). For USB this is supposed to be done by the combination of VID, PID and serial number in the device descriptor. But even early on there were a lot of devices that had the serial number empty, and thus Windows came up with some heuristics about when this method is unreliable and switches to identifying the device by its physical position on the USB bus. The whole mechanism is well intentioned, but in the end probably somewhat misguided, because it is not exactly reliable and has surprising consequences even if it were reliable.
As a side note: on modern Windows NT implementations the so-called "registry bloat" is a non-issue (anyone who tells you otherwise is trying to sell you something), but keeping a list of every device that was ever plugged into the computer in there is somewhat ridiculous.
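If you're curious what Windows currently sees versus what it merely remembers, a quick sketch:

    # Serial ports present right now
    [System.IO.Ports.SerialPort]::GetPortNames()
    # The PnP identities (VID/PID/serial) that Windows keyed the ports on
    Get-CimInstance Win32_PnPEntity -Filter "Name LIKE '%(COM%'" |
        Select-Object Name, DeviceID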
> As a side note: on modern Windows NT implementations the so called "registry bloat" is non-issue
How modern? I manage Windows 7 (transitioning to 10) machines that are used for QC in a hardware manufacturing environment that enumerate hundreds of devices (with mostly identical VID/PID) every week. We find that if we don't clear old devices out of the registry every so often, the enumeration time slows to a crawl.
In the times when it was a real issue (I would hazard a guess that that means “before XP”) the reason was that the original registry on disk format made every registry access more or less O(n) in bunch of things like the overall on disk hive size, total number of keys, number of subkeys in each of the keys along the path…
It also does this for monitors and USB/Bluetooth earphones. So you end up getting earphone(2) and monitor(2) even though you never had a second one. The only way to fix it is to delete the hidden device in Device Manager and rename it back in the monitor/audio settings.
It was really confusing to me when the script I use to change sound output and levels suddenly didn't work after a BIOS/mobo software/whatever Windows update, until I noticed the device had an appended (2).
And this is why I hate Windows in an industrial automation environment. I dislike having to troubleshoot why that USB NIC or serial device gets broken by plugging it into another port. I had to write a PowerShell script for the USB NIC issue to reapply NIC settings with a reboot.
Also, always locking an open file is repulsive. Other OSes allow renaming an open file. Not Windows! Thumbs.db gets locked because File Explorer keeps the file open, preventing deletion of an otherwise empty folder and wasting so much time waiting for Windows to unlock the file.
We do all the time. In industrial automation COM ports are still shockingly popular, although it's usually the USB emulated variety. On a lot of our development and on some of our production tools we end up with COM20 or COM30, not because we have that many running at one time but because over time we've plugged in that many distinct devices. Nowadays most drivers will assign COM(n+1) when they see a device with a new serial number.
UART is available on nearly every microcontroller under the sun, and USB<->UART serial chips are super cheap, so it makes complete sense to me that this became the de facto system for interfacing the automation controller with a computer.
Even beyond that, USB is available on many microcontrollers, a USB CDC device is dead simple to implement, the drivers are baked into every modern OS, and all the software developers operating at that layer already know how to interact with text streams. Add in the ease of debugging when you can just manually send/receive ASCII to operate a device, and you've got the makings of a standard practice.
If you use USB dongles for serial adapters, then each path through USB is assigned a different COM number when you plug it in. For example, if you plug into USB controller 2, port 3, which goes to a hub, and then you plug into port 2 on the hub, that gets a number. Now plug the same thing into a different port and it will get another COM number.
Under the hood this is because the USB devices do not have the (optional) unique serial number (or in some cases they all get the same serial number).
So do I, I find the addressing more consistent, too.
It used to be completely predictable when I was working with drivers in 1994 (patching the code), then less predictable when hardware got more diverse, and predictable again (or at least "always the same") with UUIDs.
It was always amateur/hobby dev or sysadmin so I may have had the wrong impression.
It’s the flip side of the ‘we bent over backwards so SimCity runs’ coin. Even though Windows hasn’t supported programs out of this era since 64bit became the standard, it’s still held back by clinging on to the legacy. Because it doesn’t dare say ‘this is too old, run it in a VM’.
The fact these paths are considered at all "weird" just underlines how much we live in a Unix world.
Filesystem paths used to all be weird in the sense that there was more OS diversity. I'm sure some people here remember that classic MacOS paths used colon as the separator:
Hard Drive:My Folder:My Document
VMS (designed by the same person as Windows NT, by the way) had paths of the general form device:[directory.subdirectory]filename.type;version (per Wikipedia).
As someone who grew up with Windows, I don't think these paths are that weird at all. Drive letter working directories just make sense, for example. The weirdest part is the (edit: HFS) compatibility mode (file.ext:substream).
One fun surprise is that, because of codepage reasons, Windows will use ¥ as a path separator in Japanese. In Korean, it's ₩. These characters represent U+005C, which is \ in Latin-compatible character sets.
I tend to use /dev/sda1 more than /dev/disk/by-path/pci-0000:00:17.0-ata-1.0-part1. Disk names are nice, but also often longer than 8 characters and usually not very unique.
Starting from A and iterating on through Z makes sense, for an OS that's designed for two drives at most. /dev/sda and /dev/sdb are no less arbitrary than A: and B:.
One major difference was that Unix was used on big servers and couldn't fit itself onto a single disk, so /usr had to be created. DOS and Windows never needed a second drive to boot, so they didn't need to embed their resources into the drive hierarchy.
Of course, you can mount NTFS volumes at any directory you wish since at least somewhere in the early 2000s. Very few people do it, but you can!
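A sketch with the built-in mountvol tool (the GUID is a placeholder; running `mountvol` with no arguments lists the real ones, and the target must be an empty directory on an NTFS volume):

    # List volume GUIDs and their current mount points
    mountvol
    # Mount a volume at a directory instead of a drive letter
    mountvol C:\mnt\data \\?\Volume{00000000-0000-0000-0000-000000000000}\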
The first floppy drive was A, the second B, and when internal hard drives came along they defaulted to C so as not to collide on computers that had two floppy drives.
If I remember correctly, you could use the B: drive even if you had just one physical unit. It was useful for copying files from one disk to another when you didn't have a hard drive to use as temporary storage.
> The weirdest part is the HPFS compatibility mode (file.ext:substream).
HPFS had extended attributes, but not substreams. You are thinking about HFS; substreams were added to NTFS to support storing resource forks on network shares used by Macs.
Windows is younger than Unix, and the Unix filesystem has "evolved less" due to getting it right the first time, avoiding backward compatibility issues.
NTFS implemented it to be compatible with Macs. They then started using it for storing the Mark of the Web and other special system properties, but practical use came much later.
> I'm sure some people here remember that classic MacOS paths used colon as the separator
In modern macOS (previously OS X), you’ll eventually bump into those if you need to work with paths in AppleScript. You have to specify when you’re using a POSIX path so it is properly converted. Example:
$ osascript -e 'POSIX file "/System/Applications/Mail.app/Contents/MacOS/Mail"'
=> file Macintosh HD:System:Applications:Mail.app:Contents:MacOS:Mail
As another example, with ADFS (Advanced Disc Filing System) on the Acorn/BBC computer family, the root directory was specified with `$`, and the directory separator was `.`, giving full paths like `ADFS::IDEDisc4.$.Games.!Repton.Arctic`.
ADFS is the filesystem, IDEDisc4 is the disc name, $ is the root directory, Games is a subdirectory, !Repton is an application directory (since it begins with !) and Arctic is a file within the application directory, not normally referenced by users.
macOS still uses colons as the path separator, it just does a great job of hiding them from the user. If you try to open a file with a slash in its name in a shell, though, you'll need to use a colon.
I suspect that it is the other way around and the Finder and standard dialogs (both of which use slashes as path separator when you type the path) simply shows colons in filenames as slashes.
macOS's kernel has BSD roots, so I'd be surprised if its VFS code accepts anything other than unix paths. Just a guess, but it's probably the Cocoa APIs accepting colon paths and translating it to unix paths internally.
I was parsing paths into an array of dirs recently.
The `/` root dir is quirky. You can't just do `dirPath.split('/')`. You have to handle it as a special case. It would be easier if it had a special name, like `$/dir1/dir2`.
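A sketch of the quirk in PowerShell: splitting an absolute path on the separator leaves an empty first element standing in for the root, which you have to special-case or filter out:

    '/var/log' -split '/'             # '', 'var', 'log'
    '/' -split '/'                    # '', '' - the bare root is even worse
    @('/var/log' -split '/') -ne ''   # 'var', 'log' - empties dropped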
Thank goodness Unix has cleared this up, with paths, mountpoints, overlay filesystems, chroot, device trees, bind mounts, loopback mounts and probably a few I forgot...
(sort of amazing the original premise, and the exceptions and workarounds you gradually accumulate and take for granted)
Since Windows creates the UUID the first time it "sees" a volume, and - usually - uses the network card MAC as the node, by decoding the UUID you can get the MAC address of the PC and the time the volume was first seen (this can be useful for forensics, especially with removable devices and to verify there has been no manipulation of the MountedDevices key in the Registry).
[1] Possibly Windows 11 changed that, or at least the UUIDs shown in the article are version 4.
So many things wrong with this article. Some things that I noticed by skimming over it:
> UNC paths can also be used to access local drives in a similar way:
> \\127.0.0.1\C$\Users\Alan Wilder
> UNC paths have a peculiar way of indicating the drive letter, we must use $ instead of :.
This is actually incorrect... he's accessing some random share that has no real connection to a drive. Yes, sometimes (quite often) the C$ share corresponds to the C: drive's root, but this is by no means given, as one can easily either delete the C$ share or have it pointing somewhere else entirely.
> When the current directory is accessed via a UNC path, a current drive-relative path is interpreted relative to the current root share, say \\Earth\Asia.
This is also wrong. There is no "current directory" on an UNC share (which can easily be shown by trying to open a command prompt on a UNC share, it will show an error and start you somewhere on C:\users), and the example he gives just tries to access the share "Asia" on the server "Earth"
> Less commonly used, paths specifying a drive without a backslash, e.g. E:Kreuzberg, are interpreted relative to the current directory of that drive. This really only makes sense in the context of the command line shell, which keeps track of a current working directory for each drive.
Also wrong, it's not the command line shell that keeps track of the current directories, it's the Windows kernel itself. But I agree that such a scenario is quite useless, as you can never be quite sure what the CWD is on a given drive.
> For the most part, : is also banned. However, there is an exotic exception in the form of NTFS alternate data streams.
Yeah, well, surprise: the ":" is not part of the file name, it's just a separator between filename and stream name. This is like saying that "you cannot have \ characters in a file name, but in directory names it is allowed". No, it's not. It's a separator
> Also wrong, it's not the command line shell that keeps track of the current directories, it's the Windows kernel itself. But I agree that such a scenario is quite useless as you can never be quite sure on what CWD you are on a given drive
>SetCurrentDirectory allows setting the current directory to a UNC share.
Exactly. The cmd prompt's refusal to set UNC paths as the current directory was introduced around Windows 2000 (or maybe post-XP, it's been a while) to help legacy batch files being run from a share that would get confused by being on a UNC path instead of one beginning with a drive letter.
This was also why, when you do a pushd \\server\share, cmd.exe puts you on a mapped drive instead of directly on a UNC path.
If you use the Windows native version of tcsh, for example, you can happily use UNC paths as current directories and run commands (provided they don't try to parse drive letters from their CWD)
Eh, I just thought the $ in a Windows NAS share was to ensure the share was hidden from browsing. Microsoft used to have documentation on that, but seems to be missing from their site after they removed old articles.
The bit about "UNC Paths" is a bit simplified. The "$" shares are administrative shares. They're created by default, you can delete or disable them (though, if you delete them, they'll be recreated on a reboot). You can also add normal users to them.
It should also be noted that while the single-drive-letter ones are automatically created, the "$" at the end just marks them as hidden. You can create your own hidden shares if you ever want to.
The (second-)worst offense I'm aware of here is that alternate data stream names can have otherwise special characters in them, like backslashes. So if you (for example) want to strip off the last path component, you technically cannot do this by just stripping everything after the last backslash.
In fact this probably isn't the worst thing - it's even worse than this. Because you first need to strip off the prefix that represent the volume (like C:\) before you can look for a colon. But the prefix can be something like \\.\C:\ or \\.\HarddiskVolume2\, or even \\?\GLOBALROOT\DosDevices\HarddiskVolume2\. Or it can be any mount point inside another volume! (Remember that feature inside Disk Management?)
Moreover you can't even assume the colon and alternate data streams are even a thing on the file system - it's an NTFS feature. So you gotta query the file system name first. And if the file system is something else with its own special syntax you don't know, then in general you can't find the file name and strip the last component at all.
All of which I think means it's impossible to figure out the prefix length without performing syscalls on the target system, and that the answer might vary if the mounts change at run time.
Oops, thanks for the correction! I must've seen this with other characters (most likely double quotes) and not realized slashes and backslashes are an exception.
Though ironically that still doesn't help you strip the last component, since it could still be a volume mount point. Like you don't want C:\mnt\..\foo to suddenly become C:\foo, just like how you don't want \\.\Server\Share1\..\Share2 to become \\.\Server\Share2, or for \\.\C:\..\HarddiskVolume1 to become \\.\HarddiskVolume1, etc.
> Moreover you can't even assume the colon and alternate data streams are even a thing on the file system - it's an NTFS feature. So you gotta query the file system name first. And if the file system is something else with its own special syntax you don't know, then in general you can't find the file name and strip the last component at all.
If the :stream syntax is not FS-specific then you can parse the data stream name out statically in almost every case. Yes, you have to work out the prefix, but you can mostly do that statically too, I think:
> In fact this might not even be the worst thing - it's even worse than this because you first need to strip off the prefix that represent the volume (like C:\) before you can look for a colon. But the prefix can be something like \\.\C:\ or \\.\HarddiskVolume2\, or even \\?\GLOBALROOT\DosDevices\HarddiskVolume2\. Or it can be any mount point inside another volume! Which I think means it's impossible to figure out the prefix length without performing syscalls on the target system, and that the answer might vary if the mounts change at run time.
The prefix of `\\.\C:\Foo:Bar` is `\\.\C:` as `C:` couldn't be a file name. The prefix of `\\.\HarddiskVolume2\Foo:Bar` is `\\.\HarddiskVolume2` because the volume name ends at the backslash. The prefix of `\\?\GLOBALROOT\DosDevices\HarddiskVolume2\Foo:Bar`... can be harder to determine but it doesn't matter because clearly there is no letter drive name in sight since a letter drive name would be... a single letter, but if the volume name were a single letter then it might require using system calls to resolve it (`\\?\GLOBALROOT\DosDevices\X\Y:Z\A:B` is harder to parse because X might be the volume name, or maybe Y: might be the letter drive and X might be part of the path prefix).
> `\\?\GLOBALROOT\DosDevices\X\Y:Z\A:B` is harder to parse
As in, this is impossible to do statically in the general case - those names aren't guaranteed to look like that. See the note I had added about mount points. Remember C:\mnt can itself be the mount point of a volume instead of a drive letter. (Junctions present a similar problem, but at least for those, you can make an argument that they're intended to look like physical folders, and treat them similarly. With mount points, you might not have that intention - you might be just trying to go over 26 drive letters.)
> It is, I believe, as I alluded to in the comment.
The FILE_STANDARD_INFORMATION_EX structure alludes to a common handling of alternateStream. Winbtrfs is a great resource on this, since it implements many bells and whistles from NTFS in an open way -- you just grep for a keyword and you will be close. The code exercising the Windows API for testing is src/tests/streams.cpp.
Grep on FILE_STREAM_INFORMATION in the source should provide more useful hits on the source, but phone browsers are clumsy.
A data stream is basically the file content, and on NTFS a file can have more than one. In practice it is comparable to extended attributes in the Linux world, but somewhat superior.
But like extended attributes, it doesn't seem to have much real-world use. The only use case for alternate data streams I can remember is the "this file was downloaded from the internet, do you really want to run it" warning. In such cases the browser attaches a standardized marker as an alternate data stream to the file.
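That marker is easy to inspect, since PowerShell grew first-class ADS support via the -Stream parameter. A sketch, assuming a freshly downloaded file:

    Get-Content .\setup.exe -Stream Zone.Identifier   # shows [ZoneTransfer] / ZoneId=3
    Unblock-File .\setup.exe                          # deletes the Zone.Identifier stream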
Au contraire :-). Alternate data streams are widely used by virus writers, and by spies using them to exfiltrate data from foreign (to the spy) government and corporate Windows IT systems.
You think I jest ? Look up the leaked source code for the US government spy tooling. They hide data to be exfiltrated in an ADS on the root directory of the share :-).
I finally realized ADS were the mother of bad ideas when Ted Ts'o responded to me asking why I couldn't have them in Linux for the umpteenth time by showing me a Windows Task Manager screenshot of Myfile.txt as an actively running process.
If the ADS ends in .exe then Windows will happily run it :-).
If you have macOS clients connecting to an SMB file share hosted by a Windows server, they use alternate data streams to store resource forks - like fonts. Makes for a fun 'oh shit' moment if you zip up files on Windows to archive, then realize you're missing data when you later unzip, as most compression applications don't keep them.
> macOS stores fonts in resource forks? I'm confused, what use does this have and what happens when you accidentally miss them?
Classic MacOS considers fonts to be a type of resource, and hence stores them in the resource fork. Contemporary macOS fonts are just ordinary files with a data fork only. I think grandparent is talking about the 1990s, although some of those machines remained in active use through the first few years of this century.
Windows originally considered fonts to be a type of resource too - the original bitmap fonts used with Windows 1.x-3.x are stored as a resource - except unlike MacOS it embeds resources into the EXE/DLL file data instead of putting them in a fork. In fact, a .FON file containing a Windows bitmap font is just an EXE with no code, only resources. Nobody really uses this any more, everything is TrueType now and TrueType uses its own file format, not resources, but Windows still supports the old bitmap fonts for any legacy apps which still use them.
I originally thought you meant "macOS takes random fonts, stuffs them in resource forks for other non-font files, then bad things happen if the resource forks are ever lost" which makes zero sense to me.
Anyway... so macOS fonts themselves were made of resource forks and therefore trying to transfer fonts themselves across a non-resource-fork-supporting network share will fail? As in, the resource forks were needed in order to use the font file?
Not me. ajcoll5 made the statement, you expressed confusion with it, I tried to explain what (I assume) ajcoll5 meant.
> Anyway... so macOS fonts themselves were made of resource forks and therefore trying to transfer fonts themselves across a non-resource-fork-supporting network share will fail? As in, the resource forks were needed in order to use the font file?
On Classic MacOS, for some files all the actual content is in the resource fork, and the data fork is ignored and can be empty. So if you copy such a file to a filesystem which doesn't support resource forks, you can end up with an empty file.
A good example of this is executables. In 68k Mac executables, all the code is stored in the resource fork (as code resources), and the data fork is ignored and can be empty. So if you copy a 68k Mac executable to a forkless filesystem, you can end up with an empty file.
By contrast, in PPC Classic MacOS executables the code is in the data fork, and the resource fork only contained actual resources such as icons or strings, not the code. If you lost the resource fork, you'd still have the code of the executable. But it probably wouldn't run without the icons/strings/etc. it expected.
This was how Apple's original (1994) implementation of "fat binaries" worked. The data fork contained the PowerPC binary and the resource fork contained the 68K binary. PPC Macs would load and run the PPC code from the data fork, 68K Macs would ignore the data fork and load and run the code from the resource fork. If you only needed PPC support, you could shrink the executable by deleting all the 68K code resources from its resource fork.
The core resources of Classic MacOS were originally stored in a single file, the "System suitcase". Originally, each installed font was a separate resource in the resource fork of that file; its data fork was unused, except to store an easter egg text message. Fonts were distributed as resources in separate suitcase files, and the "Font/DA Mover" copied them from the distribution suitcases into the system suitcase. So yes, a suitcase file used to distribute a classic MacOS font, the actual font data would be in the resource fork, and the data fork could be empty. In System 7.1, Apple introduced a separate folder called "Fonts". In some MacOS versions (not sure when it was introduced, but definitely was there by System 7.0), Finder displays suitcases as if they were folders, even though they are actually resource forks.
Contemporary macOS doesn't really use any of this stuff. It supports resource forks for backward compatibility, but modern applications don't use them. The "Font Book" app can import Classic MacOS fonts (not bitmap ones, but TrueType and Type 1) from the resource fork of a suitcase file. But once imported, the fonts are stored in ordinary files (with a data fork only) on the filesystem.
Eh, whatever. I originally thought that's what whoever-it-was meant; can't edit the comment now.
> On Classic MacOS, for some files all the actual content is in the resource fork, and the data fork is ignored and can be empty. So if you copy such a file to a filesystem which doesn't support resource forks, you can end up with an empty file.
Yeah, that's about what I thought. That makes sense, thank you~
You seem to be talking about a specific command-line argument of the compact command, with the Windows-typical (and IMO ugly) option style of '/' instead of '--' as the option marker and ':' instead of '=' as the option value separator.
But that would not be directly related to ADS, and I cannot imagine a good use case where the compact command would use ADS.
In the context of ADS, the first thing I imagined was storing the compressed and uncompressed file alongside each other (which is rather silly - why compress at all?).
This use case is also kinda strange. Have the compressed content as an ADS, the primary content filled with 0 as sparse, and fill it when needed/accessed. :/
C:\foo has a default (primary) data stream; the name of that stream is empty, so it's omitted entirely when writing the name. But the file can also have C:\foo:bar on NTFS. It's a different stream that's part of the same file. (Look up "NTFS ADS" or just "NTFS streams".) These are often used to store information tied to a file that shouldn't affect the file contents.
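A quick sketch of creating and reading one on an NTFS volume (the file name is arbitrary):

    Set-Content C:\temp\foo.txt -Value 'visible contents'
    Set-Content C:\temp\foo.txt -Stream bar -Value 'hidden contents'
    Get-Item C:\temp\foo.txt -Stream *        # lists :$DATA plus 'bar'
    Get-Content C:\temp\foo.txt -Stream bar   # 'hidden contents'
    # dir only reports the default stream's size, so the file looks unchanged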
In the late 1990s, there was a bug in MS IIS where if you requested http://example.com/page.php , it would execute the PHP script, but if you requested http://example.com/page.php: , it would give you the PHP source code. Even more than today, it was common to hard-code database connection details, including passwords, into the source code.
One thing that makes Windows paths weird is that the Windows API, NTFS and most Windows tools all have different restrictions on file paths.
NTFS will accept almost anything. The Windows API (I'm thinking of the old Win32 one) applies most of the restrictions the article mentions.
But, for example, not the normalization part. A filename can end with a space, no problem.
That once led me to a minor bug in .NET Framework. One of the path-related functions (I think it was Directory.Move) did not correctly apply this normalization and could produce directories with trailing whitespace. Good luck removing/fixing those in Windows Explorer.
So for the longest time Adobe software had random bugs where it would create a series of folders named "Application Data" repeating recursively 3000+ characters deep.
The bit about allowing / as a path separator is one of my favorite bits of DOS/Windows trivia. As a unix guy it's fun to give a windows person a path with the slashes wrong like "z:/foo/bar", being corrected for a unix-ism, then having it actually work!
In practice I think the biggest problem with using forward slashes on Windows is confusing programs which expect "/" to indicate program switches. The non-uniformity of shell parsing is also a big unix/win design difference.
It doesn’t work everywhere. For example tab completion in cmd.exe doesn’t work for a path containing forward slashes (even when quoted), because forward slash is the prefix character for command-line options.
Right, but that's a cmd.exe thing, not a Windows thing.
Windows supports it, CMD doesn't. Programs that you run from a CMD prompt support other option-flag syntaxes, so it's just a cmd.exe feature.
CMD.exe is its own thing with its own backwards compatibility requirements and the case could be made that cmd.exe is "Windows" as much as anything else is, so I get it.
The point is, you can’t just blindly use forward-slash as a file system path separator everywhere on Windows. It’s not on equal footing with backslash in that respect.
That's incorrect. You're confusing totally separate issues by examining specific pieces of software with product specific bugs. This isn't a valid way to examine the issue: by this metric, spaces aren't supported on unix because many programs choke on them.
In fact, you can use forward slashes across the entire file API on Windows. That's the point.
I'm viewing this from the end user's perspective. They can use backslashes everywhere as a path separator, but they can't use forward slashes everywhere. In that sense, forward slashes are in practice a second-class citizen on Windows. The canonical path syntax is and will remain with backslashes.
I'd qualify that as "almost no non-programmers know". Forward slashes are so useful in languages that use \ as an escape sequence that most programmers do know this.
I had to double-check, but I ran into some issues at work where .NET Framework got confused if I used both separators in a path and used ".." to try to access the parent directory.
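In my experience the reliable approach in .NET is to normalise early; a sketch:

    # GetFullPath resolves both separator styles and '..' to one canonical path
    [System.IO.Path]::GetFullPath('C:/Users/../Windows')   # -> C:\Windows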
> UNC paths have a peculiar way of indicating the drive letter, we must use $ instead of :.
I don't believe that's true, I am almost positive they're SMB shares, just like any other, but are created by the system, which is why "accessing drives in this way will only work if you’re logged in as an administrator."
The dollar sign indicates that the share is 'hidden' and can't be enumerated by traditional means. The C$ share is created by default and provides root level access to the system drive, and is locked down by default for this reason
You are correct that they are just SMB shares like any other. They can be removed, though many management processes across different applications assume that those shares will be present.
In UNC paths you can append “$NOCSC$” to the hostname to force the client to bypass the “Offline Files” cache. (There are probably other wild undocumented bits like this one hiding in other places in the Windows stack.)
I don't recall. Like the other reply to you says, these get leaked in support, etc. I'll also run "strings" or even Ghidra on closed-source binaries when I'm troubleshooting issues. There's usually good fun to be had from Microsoft binaries doing that. I've discovered undocumented debugging switches, registry entries, etc.
(In version 10.0.19041.985 of cscsvc.dll in Windows 10 I'm seeing the string "If you hit this breakpoint, send debugger remote to BrianAu." Presumably that's "Brian Aust", referenced in a chat[0] re: Offline Files.)
I knew the Windows filesystem layout was super bonkers when I had to explain to fellow devs that on a 64-bit machine, you put the 32-bit libraries in SysWOW64 and the 64-bit libraries in System32.
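You can watch the redirection happen; a sketch ('Sysnative' is the documented escape hatch that only exists inside 32-bit processes on a 64-bit OS):

    [Environment]::Is64BitProcess               # False in a 32-bit PowerShell
    # Under WOW64, System32 silently redirects to SysWOW64, so this
    # alias is the only way for 32-bit code to reach the real System32:
    Test-Path "$env:windir\Sysnative\cmd.exe"   # True only in a 32-bit process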
This is a great article and really illustrates just how hard Windows works to be backwards compatible.
Lots of these (eg: the COM/LPT stuff) could be dropped and wouldn't affect most people either way, but for those things depending on it, it would be a profoundly breaking change.
'echo foo > COM1' returns 'The system cannot find the file specified.' on Windows 11. (Machine doesn't have a COM1; if this wasn't being redirected to the port, it'd have gone into a file of that name.)
I think what's missing from this discussion is an emphasis on how layered Windows paths are.
The Win32 paths are like an emulation layer. They parse the given path and produce a kernel path. Win32 implements all the weird history you know and love as well as things like `.` and `..`. You can use the `\\?\` prefix to escape this parsing and pass paths to the kernel.
The NT kernel has paths like `\Device\HarddiskVolume2\path\to\file`. NT paths are much simpler. There are no restrictions on paths except that they can't contain empty components (notably they can contain nul). At this layer, `.` and `..` are legit filenames.
However, it's the filesystem driver that ultimately says what's a valid filename and what isn't.
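You can poke at both layers from PowerShell; a sketch, assuming PowerShell 7+/.NET Core, which passes `\\?\` paths through untouched:

    # Win32-style parsing resolves '..' before the kernel ever sees the path
    [System.IO.Path]::GetFullPath('C:\Windows\System32\..\win.ini')   # C:\Windows\win.ini
    # The \\?\ prefix hands the path over verbatim - no normalisation
    [System.IO.File]::Exists('\\?\C:\Windows\win.ini')                # True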
> Say you, for whatever incomprehensible reason, need to access a file named .., which would normally be resolved to the parent directory during normalisation, you can do so via a literal device path.
Oh no. No. Windows allows files to be named `..`?!
But seriously: no, at least not on NTFS. This filename does have a trailing space, though, and that is enough to defeat Explorer: you cannot move or delete it, and the properties window is broken.
Preferring a minimal look (and being immature) my desktop shortcuts for "This PC" and "Recycle Bin" have been renamed with two of the many invisible characters that windows allows.
I also routinely use single extended unicode characters as root folder names and identifiers for various purposes.
Using a search program, "Everything", it's a lot easier to find things if I use something like the pilcrow symbol as the root folder for any directory dedicated to text documents, when the alternative is to wade through results for 'documents', 'text', 'reading' or any combination of those words.
For the same reason, I find I can make much more memorable associations. It helps me harness things relationally. I can preserve uncertainty and avoid the frustration and negativity of trying to make shades of grey and rose fit black and white patterns. It does sound a bit new age, but there's no doubt in my mind: flat hierarchical alphanumeric patterns are restrictive, prescriptive, insufficient. For example, a lot of artists actively work to defy pigeonholing. I still need identifiers.
I mean, even if I wasn't into 'bleeding edge' culture, restrictions, problems and frustrations are the normal experience. I think this is illustrated by the unsatisfactory experiences that people find when they try to make id3 tagging "work".
It's as close as I can get to banishing the pervasive 'what-if' heartbreak of WinFS being cancelled. Sadly it doesn't help at all make up for what 'Semantic Web' promised. But that's probably why I'm a believer in GPT and the like.
Is it just me that can't help thinking they are products that have arisen from the need to make non-semantic computing useful again?
Yes, you can create files named `.` and `..`. However, any sensible filesystem driver will reject those names (spoiler: there do exist drivers that aren't sensible).
Under unix, if you create a symlink to a directory, e.g. `~/syslogs` is a symlink to `/var/log`, then `..` can be used to traverse the "true" parent directory. So `~/syslogs/../lib` will traverse `/var/log/..` and refer to `/var/lib`, not to `~/lib`.
However, a "normalising" path interpreter will just take something like `~/syslogs/../lib` and change it to `~/lib` without consulting the filesystem.
Given that (AIUI) Windows has supported symlinks for a while now (?), it's possible that files called `..` aren't actually allowed, but the ability to access `..` is still necessary.
(Notably, the article does point out that filenames ending in `.` are disallowed - which should exclude `..` as a name one can give a file.)
Just using Cygwin will show that file names can indeed end in periods (and spaces). The article is very much restricting itself to the standard limitations imposed by the Win32 API, not what the operating system actually allows. Case sensitivity has always been a thing, since Windows NT 3.1, for example; the "forbidden" characters are not so forbidden with the right file access APIs.
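For example, a sketch of creating and removing a name with a trailing space by going through the extended-length prefix (assumes NTFS, an existing C:\temp, and PowerShell 7+/.NET Core, which passes `\\?\` paths through):

    $p = '\\?\C:\temp\trailing.txt '                       # note the trailing space
    [System.IO.File]::WriteAllText($p, 'Explorer hates me')
    [System.IO.File]::Delete($p)                           # needs the same literal path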
No coverage of this nonsense would be complete without also mentioning that CON, AUX, PRN and a couple of others are verboten as filenames in Windows. Although apparently you can defeat this via e.g. \\?\C:\con
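For the curious, a minimal sketch of that escape hatch in Python (assuming Windows and an existing C:\temp; a bare C:\temp\con would be rejected):

```python
import os

# The \\?\ prefix tells Win32 to skip its path normalisation, so the
# reserved device name CON becomes an ordinary filename:
path = r"\\?\C:\temp\con"
with open(path, "w") as f:
    f.write("not the console\n")

os.remove(path)   # deletion needs the \\?\ form too
```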
One of the early stupid annoying teenager programs I wrote was a tool that would spam your desktop with CON.001, CON.002, and so on through the \\?\ trick.
Windows Explorer could not delete the file. You had to specify the \\?\ path to get the delete call to work, but that didn't play well with cmd.exe's `del` command.
I've since used these files to create directories that can't be deleted by automated cleanups and such, like a special folder in %TEMP% that one program needed but didn't create on its own.
The 260-character path limit has been the bane of my existence. Even though it can be disabled in the registry, there is a gazillion pieces of software built against old APIs that will still not work. You also get really odd bugs when that plague hits you.
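If you can't flip the registry switch, the `\\?\` prefix sidesteps the limit per call; a rough Python illustration (Windows only):

```python
import os

# A single 255-char component (the NTFS per-component cap) pushes the
# full path well past 260, yet the open succeeds thanks to \\?\:
name = "x" * 255
path = "\\\\?\\" + os.path.join(os.getcwd(), name)
with open(path, "w") as f:
    f.write("past MAX_PATH\n")
os.remove(path)
```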
Not only there. Windows originally created them on disks that supported long names, too.
That was necessary to support the use case where an older OS tried to read the disk (which could happen because the user rebooted into an old DOS, for example, or because an external disk was moved to a different computer).
“VFAT, a variant of FAT with an extended directory format, was introduced in Windows 95 and Windows NT 3.5. It allowed mixed-case Unicode long filenames (LFNs) in addition to classic 8.3 names by using multiple 32-byte directory entry records for long filenames (in such a way that only one will be recognised by old 8.3 system software as a valid directory entry).
To maintain backward-compatibility with legacy applications (on DOS and Windows 3.1), on FAT and VFAT filesystems an 8.3 filename is automatically generated for every LFN, through which the file can still be renamed, deleted or opened, although the generated name (e.g. OVI3KV~N) may show little similarity to the original. On NTFS filesystems the generation of 8.3 filenames can be turned off. The 8.3 filename can be obtained using the Kernel32.dll function GetShortPathName.”
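You can ask for that alias yourself via GetShortPathNameW; a small ctypes sketch (Windows only; where 8.3 generation is disabled you simply get the long name back):

```python
import ctypes

def short_name(path: str) -> str:
    # GetShortPathNameW fills buf with the 8.3 alias of an existing path
    buf = ctypes.create_unicode_buffer(260)
    ctypes.windll.kernel32.GetShortPathNameW(path, buf, len(buf))
    return buf.value

print(short_name(r"C:\Program Files"))   # typically C:\PROGRA~1
```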
Right. Back in the 90s I worked on a network server to allow AppleTalk clients into DOS or OS/2 based networks. The Mac users enjoyed their filename freedom but the PC clients had trouble with the super-weird 8.3 short names. You couldn't really tell what the Mac filenames were.
The other direction worked great, though, DOS filenames always worked on the Mac side of the network.
I realized at some point that there is a discrepancy between what's allowed on the file system and what's allowed by "Windows" itself (or, more exactly, by the programs running on Windows and using its APIs to communicate with said file system).
In this case, NTFS totally allows "illegal" characters such as < > : " | ? * etc.; pretty much everything except / and \, and \0, I think.
This makes for funny situations where Windows programs sometimes cannot deal with such files. At best, they can't read, write or rename them... at worst, they'll crash, which is always fun.
This got me to try out the Fileside app (an Explorer/Finder alternative) from the blog post's author. It's available for Windows and Mac.
I think this is an interesting space with room for innovation.
Fileside starts out with a grid of four directories: Home, Documents, Desktop and Downloads. You can customize and name new grid layouts that are shown in a sidebar for quick switching. This seems like a neat idea for specific recurring manual workflows.
It doesn't seem to be targeted at the minimalist crowd. Directory entries beginning with a dot are visible (but greyed out), full Unix-style permissions are shown for each entry, etc.
It looks like it's Electron-based and implemented in a JavaScript SPA framework. It doesn't use the default system font (SF Pro) on Mac, and a bunch of other things also don't look or behave as you'd expect.
The font weight in the size column maps to each file's relative size. All the way from very thin to very bold. Kind of cute.
The path completion seems pretty good - as could be guessed from the blog post.
I think this app sometimes confuses power with details/verbosity. There are some gold nuggets in there though.
The worst thing about Windows paths is how unintegrated it all is. You can navigate into all kinds of weird paths in the Win32 COM shell (File Explorer), which is itself possibly the pinnacle of executed design MS ever achieved. But those paths you build... you can't copy them to the clipboard, you can't serialize them, you can't move them between various tools, and particularly not to the command prompt nor to so-called PowerShell. If there ever was a continent of independent fiefdoms, Windows is it :-/
If you don't know what I am talking about, try navigating to your Android phone's image folder in File Explorer. Then try to USE that path in PowerShell or cmd to copy those image files... good luck.
There must certainly have been some moron in charge to make SURE things couldn't interoperate on Windows... in spite of them having Explorer's design.
I believe it’s because it’s executing a Windows program (ssh.exe) located in Cygwin’s mount for your C: drive and that program therefore expects Windows-style paths.
There is also this can of worms regarding translations.
For example, C:\Users will be shown as C:\Benutzer on a German system but will still be C:\Users on the FS.
"C:\Program Files (x86)" will show up as "C:\Programme (x86)", BUT "C:\Program Files" will stay the same (untranslated).
(I forget what this topic is called, though, and at what layer it takes place.)
Windows XP could already do this, but for the most part didn't. I remember, ~18 years ago when I was a sysadmin, one user out of 50 got his "My Documents/Pictures/etc." in English, but for me on the file server it was all in German. Very confusing.
It's the shell (i.e. Explorer) showing a localized name. Although I think in Windows XP it was still baked into the language you installed Windows with and stuck with that.
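The on-disk name never changes, which is easy to verify from anything that bypasses the shell; for instance, in Python on a (hypothetical) German install:

```python
import os

# Explorer may show "Benutzer" on a German system, but the directory's
# real on-disk name is still "Users":
print("Users" in os.listdir("C:\\"))      # True
print("Benutzer" in os.listdir("C:\\"))   # False
```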
That's because the Unix convention is far simpler: everything except / and \0 (the null byte) is allowed in filenames, which are also case-sensitive (exact byte match).
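Which means almost any byte sequence is a legal name; a quick Python sketch, on a typical Linux filesystem, using characters Windows would reject several times over:

```python
import os

# Everything but "/" and NUL is fair game on ext4 and friends,
# including newlines and the whole Windows-forbidden set:
name = 'so<>:"|?*\nweird'
with open(name, "w") as f:
    f.write("perfectly legal here\n")
os.remove(name)
```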
Amiga used to have the best file-path convention I've ever worked with: drive:dir/file, where drive was not a letter like in Windows but rather the drive's label, so you could have Work:Pictures/HN.jpg.
To go up to the parent directory you had to use an additional slash. So /xxx was the equivalent of Windows ..\xxx, and you could add more slashes to go further up in the directory tree:
Work:a/b/c/d////file was the same as Work:a/file.
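If I'm remembering the rule right, it's easy to model; a toy Python resolver (my own sketch, not AmigaDOS code, and it ignores leading-slash relative paths):

```python
def amiga_resolve(path: str) -> str:
    # Toy resolver: an empty component between slashes means
    # "go up one level", as in AmigaDOS.
    drive, rest = path.split(":", 1)
    stack = []
    for part in rest.split("/"):
        if part == "":
            if stack:
                stack.pop()
        else:
            stack.append(part)
    return drive + ":" + "/".join(stack)

assert amiga_resolve("Work:a/b/c/d////file") == "Work:a/file"
```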
Drives could be "virtual" ones, similar to "bind" mount points in Unix, that could be associated with multiple positions. E.g. you could assign both System:Libs and Data:MyLibs to the virtual drive LIBS: so that LIBS:xxx would match a file called xxx in either directory.
Files used to have a comment field for storing a kind of extended attribute, but it was seldom used IIRC.
Wildcards were quite unique: I think ? was like regexp ., meaning any single character, and # was like * but prefixed, meaning any number of the _following_ pattern, so that #? would match anything.
I'm sure there were other niceties I can't remember right now.
Ahhh, the memories of naming my folder AUX and then having my Uni professors with administrative privileges unable to access my files in that folder. It drove them crazy: they thought the secret of the Universe was hidden there and demanded I let them see what was inside, when in reality I had nothing at all and only wanted to show off. Novell NetWare on top of DOS, the year is 1994. Good times.
.NET Framework to .NET Core transition, Xamarin.Forms to MAUI, XNA, the multiple rewrites of the WinRT platform since Windows 8, the Windows 8 to 10 user-space driver framework, .NET Native, C++/CX, Win2D, WinRT for Xbox replaced by WinGDK, the .NET SharePoint Framework replaced by a JavaScript one...
I still don't think you know what backwards compatibility is. You are saying whatever is necessary to get me to concede this point, and since you are confused about backwards compatibility, you are not correct.
Backwards compatibility is not about keeping all features once supported in Visual Studio in all future versions. That is forwards compatibility, and Microsoft does not do that.
We are talking about backwards compatibility: the ability of new operating systems to run, unmodified, software which ran on old versions of the same operating system.
This might be some run-of-the-mill weirdness, but I was recently using a Microsoft file-globbing library (https://learn.microsoft.com/en-us/dotnet/api/microsoft.exten...) and it handed back file paths with forward slashes (instead of double backslashes), even on Windows (and on .NET Framework). I don't know if this is a library someone developed for .NET Core, forgetting that it was also going to be used in .NET Framework? Anyway, another reason I don't like the occasional dip into Windows I have to do at my job (which is 80% Mac/Linux).
It'd be nice if Microsoft read this list and adjusted their software, like perhaps File Explorer, to be able to read and write this data. Or at least delete it.
A long time ago I found a hidden directory on my file server that some student had created to store their pirated software. This was back in the days of DOS and Novell NetWare.
Turns out you could create files with "illegal" (and invisible) characters in the filename. The standard OS utilities would not allow them, but the underlying file system did not care. So you could write a short program to do it.
They should have fixed that when they went to long file names. It's ridiculous that you can't name a file with its contents' actual title. Random example: http://doi.org/10.1145/327070.327153
Connecting to a Windows share from another operating system that allowed names like that.
There are a few other possibilities, like booting into Linux with an NTFS driver that allows you to create illegal file names, and/or odd things like WSL/Cygwin.
Possibly the most annoying thing about Windows w.r.t. this topic is that you need a large physical C: drive to accommodate future Windows bloat. It is very hard to get anything installed so that it puts its data on another physical drive, and it is impossible to extend C: onto another physical drive. Realistically you need a 1TB SSD/NVMe as your primary drive, so if you get a laptop you usually need a high-end one to offer you that.
If they could slowly start adopting Unix file paths and slowly phase out Windows file paths, I think more people would start to use Windows. I would love to want to use Windows. It's a platform with first-class hardware support and paid support, and it's designed (well, in theory) with users and application platforms in mind. And it actually has a bunch of advanced features that Linux and Mac don't have.
Windows was built with a POSIX layer; it already does this.
"Broad software compatibility was initially achieved with support for several API 'personalities', including Windows API, POSIX, and OS/2 APIs – the latter two were phased out starting with Windows XP."
You can create files and folders with illegal names like '.' or '..' with Python. Comedy ensues if you try to recursively go into the '.' directory, and Explorer crashes -_-. Despite Windows having rules, apparently not everything needs to play by them, despite setting those expectations.
Add \\wsl$ to the list of "weird paths". This one fits the general schema the article is about but has the special meaning of "your files in the Linux VM hiding in the Windows box" (aka WSL). It's served over the Plan 9 network filesystem protocol.
Is it a weird path? It's just a simulated computer with hostname wsl$. It looks like this is just a network share server running on the loopback interface. Similarly, \\wsl.localhost\ will also show you your WSL files. The dollar sign has some weird special meaning to the Windows security system (trusted account, local computer, etc.; in AD it's related to this mess: https://techcommunity.microsoft.com/t5/core-infrastructure-a...) but it's essentially just a standard Windows UNC path.
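And it behaves like any other share, e.g. from Python on the Windows side (assuming you have a distro actually named Ubuntu):

```python
import os

# Each WSL distro shows up as a share; "Ubuntu" here is whatever
# your distro happens to be named:
print(os.listdir(r"\\wsl.localhost\Ubuntu"))   # etc, home, usr, ...
```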
One of these tripped me up: the "Disallowed Names" section has a bunch of not-too-wordlike names, except one.
When I made carefulwords.com (which I made because I wanted a thesaurus where you could just write e.g. https://carefulwords.com/book for "book" and get the results), I found out the hard way that you cannot make a file named "con" on Windows. Or "con.html", or with any other extension. You can try to force this, creating it via a script, but then programs like Git will hang when they come across it. So in my thesaurus the actual page is /con-word.html and I just have it rewrite to /con.
This has actually changed in Windows 11. You can use "con.html" without fear. "con" is still a bit of a problem. ".\con" will work but not a bare "con".
It's not really a new version of Windows if it does not introduce a new variation of universal path addressing to end all variations of universal path addressing.
Recently we had a bug in an Electron app that couldn't call exec on a file path containing a space on Windows. After two days of researching and trying, I still don't have a solution. I reckon adding Node.js to the fantastic world of Windows file I/O does not make it any easier.
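For what it's worth, the usual culprit is building a single command string that a shell re-parses; passing the path as its own argv element generally fixes it. A Python sketch of the two shapes (the .exe path is hypothetical):

```python
import subprocess

exe = r"C:\My Tools\app.exe"   # hypothetical path containing a space

# Fragile: handed to the shell as one string, which splits on the
# space and tries to run "C:\My":
#   subprocess.run(exe + " --version", shell=True)

# Robust: an argv list, so nothing re-parses the path:
subprocess.run([exe, "--version"])
```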
As a related anecdote: you NEVER want to put whitespace in file paths. Ever. Tons of programs will just break, even in 2023, even Microsoft ones (looking at you, PowerShell), and I'd imagine forevermore.
Your life will end up as a series of awful MYPATH~1\ kludges. You have been warned.
EDIT: There seem to be different camps here, and perhaps a generational divide. Maybe the kids haven't been (or never will be) burned by this one, but I've seen too much, wasted too many hours, written too many workarounds, and will forever remain #TeamNoWhiteSpace.
Counterpoint: Nearly all of my (thousands of) users use whitespace in basically every single folder name and file name every single day (which are usually titles of papers or experiments), and none of them ever have any problems.
When writing software I always make sure that all my test data has spaces in paths and filenames - and not just the normal space but weird Unicode ones too.
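Something like this hypothetical fixture loop, say (names and characters are just examples):

```python
from pathlib import Path

# Smoke test with ordinary and exotic whitespace: U+00A0 (no-break
# space) and U+3000 (ideographic space) catch different bugs:
for ws in [" ", "\u00a0", "\u3000"]:
    p = Path(f"test{ws}data.txt")
    p.write_text("ok")
    assert p.read_text() == "ok"
    p.unlink()
```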
I grew up in the days of DOS 3 and onward, so I am basically physically incapable of using whitespace in filenames. And I'll frequently replace spaces with underscores in files shared with me.
Fascinatingly though (and sometimes irritatingly), several of my mentors have cautioned me to drop this habit as I get promoted. At management level, they all use white space and dots haphazardly, and they apparently perceive underscored filenames negatively. This goes up drastically at executive level.
That's... definitely not as dramatic as you try to make it sound. "Program Files" and "Documents and Settings" have been around for almost three decades, and most programs work just fine with files living somewhere inside those paths.
And the "cherry on top" is there's an env-var named `%ProgramFiles(x86)%` (aka `${env:ProgramFiles(x86)}` in PS) that points to where that directory actually lives since AFAIK it's controlled by a registry entry
Really? You posted a divisive and absolute statement and people pointed out how it's not accurate. Considering "Program Files", a very common Windows path, was introduced back in Windows 95 (https://devblogs.microsoft.com/oldnewthing/20120307-00/?p=81...), almost 30 years ago, it's pretty hard to blame it on "the kids".
That doesn't sound like "those damn kids and their newfangled apps that can handle spaces, they'll never know the pain I went through!!!" and more like "I said something that was clearly not accurate and people pushed back".
I've had lots of programs, libraries and scripts freak out from having whitespace in a file path.
> I said something that was clearly not accurate and people pushed back
Is my life experience invalid? It's a needless error that I never want to deal with again. I cover my mouth when I cough, I use my turn signal when changing lanes, and I don't put whitespace in a file path or a URL. It's that simple.
What I cannot fathom for the life of me, on HN of all places, is vehemently defending a practice that is not guaranteed to work 100% of the time: "Well, iiiive never grazed an oven coil pulling a pot roast out of the oven, so obviously this guy is an idiot for advocating oven mitts." Give me a break, dude.
It’s almost as if, get this, it’s not a problem, let alone anywhere even remotely in the vicinity of a problem of the scale and magnitude you claim it to be.
It’s almost as if people in 2023 get by fine with spaces in their filenames whereas you seem to be stuck squarely in the 1980s.
I know, it’s a crazy idea. Those kids and their insanity. /s
You know, my main hobby is writing ASM for classic video game consoles, and my opinions and experience involve lots of janky / homemade / antiquated programs, so honestly you're not wrong =p
Microsoft called the directory where programs are installed "Program Files" and put documents in "My Documents" specifically so that developers would have to learn to deal with spaces immediately.
Curious about PowerShell misbehaving there. Care to share any details? Typically you never want to handle paths to files as strings, or at least resolve them to File/DirectoryInfo objects as soon as possible. But the only real blunder regarding paths in PowerShell I'm aware of is the Path parameter to various cmdlets and having [] in file names (which is why LiteralPath always exists alongside Path parameters).
Came here to mention []. For those who haven't looked it up, in addition to * and ? as wildcard characters, PowerShell supports "[xyz]" and "[a-z]", meaning "x y or z" and "anything from a to z", so "[a-z]alls" would match calls, falls, walls, etc.
I knew the guy who tested the feature in PowerShell and he was deeply frustrated over the fact that there was no way to escape some sequences. For example, you can loosely match "[abc]" with something like "?abc?" or "[[]abc?" because the [[] indicates exactly one opening square bracket but there's no way (as far as I know) to say "this section ends with a square bracket".
The PM really wanted that feature, though, even though you could probably count on one hand how many times it's been used in real life.
Ah, OK. This works because you're using single quotes on your strings, meaning the escape sequence `[ is preserved until it reaches the globbing layer. This will not work, for example...
gi "`[abc`]"
...because the `[ will be processed before the string gets sent to globbing, meaning the back tick will be removed. This would work with double quotes:
gi "``[abc``]"
...because at the command line level this will evaluate to `[abc`] and the globbing will know that the square brackets are literal. So I will concede and downgrade my complaint from "impossible" to merely "overly complicated for the layman".
Man, it's never that simple. How about generating PowerShell scripts with quotes? Sometimes you are dealing with escape characters like \", sometimes you aren't. Why bother with all that? Hell, versions of software and compilers change such behaviors, either on purpose or by accident, all the time. It just gets messy, dude. C:\NEVER_~1\AGAIN_~1.NOP
Perhaps I'm a dinosaur still scarred by the golden olden days, but there is no convincing me that whitespace in filenames or URLs is ever a good idea.
> You gave two examples that aren't actually issues.
Man, I've had config files break because I saved them in UTF-8 instead of ANSI. I hope to God you never have to experience the horror... You also have a nice day.
Put that into a desktop.ini file, assign it the right attributes (`attrib +S +H desktop.ini`) and your folder will look right in any program that uses the shell API without any risk of breaking programs.
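A rough Python sketch of the whole dance ("My Folder" and the display name are made up; LocalizedResourceName is the relevant desktop.ini key):

```python
import os
import subprocess

folder = "My Folder"   # hypothetical target folder
os.makedirs(folder, exist_ok=True)

ini = os.path.join(folder, "desktop.ini")
with open(ini, "w") as f:
    f.write("[.ShellClassInfo]\n")
    f.write("LocalizedResourceName=Fancy Display Name\n")

# The shell only honours desktop.ini when the file is system+hidden
# and the folder itself carries the read-only (or system) attribute:
subprocess.run(["attrib", "+S", "+H", ini])
subprocess.run(["attrib", "+R", folder])
```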
Fun stuff!