Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add long path support wrt #39 #40

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Add long path support wrt #39 #40

wants to merge 2 commits into from

Conversation

hasufell
Copy link
Member

@hasufell hasufell commented Mar 8, 2025

No description provided.

@hasufell hasufell force-pushed the long-paths branch 12 times, most recently from a3ec45b to cb96457 Compare March 8, 2025 12:01
-- | Open a file and return the 'Handle'.
openFile :: WindowsPath -> IOMode -> IO Handle
openFile fp iomode = bracketOnError
openFile fp' iomode = (`ioeSetWsPath` fp') `modifyIOError` do
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this function is going to have different semantics than the base one even with this PR included.

As far as I see base's Path -> Handle counterpant for Windows is defined at https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Handle.hsc#L853.

There still going to be extra things that base does and this package doesn't, like some sort of file access optimization, no idea what it does though: https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Handle.hsc#L875.

There's also going to be handling of locked files that is not done in this package as far as I see, though I haven't checked above openFile function maybe there's something somewhere. Locking is checked after aforementioned optimizations https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Handle.hsc#L877.

There's also some attempt at truncation if we're overwriting file https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Handle.hsc#L890.

All in all this would mean that file-io is not a drop-in replacement for what base does and someone may be able to observe the difference by switching to file-io just like with long paths.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all this would mean that file-io is not a drop-in replacement for what base does and someone may be able to observe the difference by switching to file-io just like with long paths.

Honestly, I don't feel like I want this to be a strict drop-in replacement, especially since this package uses proper Win32 functionality and there is no posix emulation layer involved. Even with the native winIO manager in GHC, there may be subtle differences.

Long path support is definitely something major, so supporting it makes sense.

For other invariants, I think we'd need to find a test that demonstrates different behavior first.

Copy link
Member Author

@hasufell hasufell Mar 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Truncation is already tested here when writing over an existing file:

file-io/tests/Properties.hs

Lines 196 to 205 in e2b5ebc

existingFile2' :: Assertion
existingFile2' = do
withSystemTempDirectory "test" $ \baseDir' -> do
baseDir <- OSP.encodeFS baseDir'
let fp = baseDir </> [osp|foo|]
OSP.writeFile fp "test"
r <- try @IOException $ do
OSP.openExistingFile fp WriteMode >>= \h -> BS.hPut h "boo" >> hClose h
OSP.readFile (baseDir </> [osp|foo|])
Right "boo" @=? r

And it seems correct to me, as we set it to truncate here:

WriteMode -> Win32.tRUNCATE_EXISTING
AppendMode -> Win32.oPEN_EXISTING

rightOrError (Right a) = a

-- inlined stuff from directory package
furnishPath :: WindowsPath -> IO WindowsPath
Copy link

@sergv sergv Mar 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a good idea for implementing furnishPath would be to reuse whatever base does via C function here https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Paths.hs.

Apart from matching semantics it would mean less copying around. The only drawback is that someone will need to check with each ghc release whether the foreign function is still named the same.

Otherwise all those conversions below to and from lists of characters somewhat undermine the cool idea of sticking with byte arrays.

isPathRegular path =
not ('/' `elem` (WS.toChar <$> path) ||
ws "." `elem` WS.splitDirectories (WS.pack path) ||
ws ".." `elem` WS.splitDirectories (WS.pack path))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to change

           ws "." `elem` WS.splitDirectories (WS.pack path) ||
           ws ".." `elem` WS.splitDirectories (WS.pack path))

to

           any (\x -> x == ws ".." || x == ws ".") WS.splitDirectories (WS.pack path))

fromExtendedLengthPath ePath =
case WS.unpack ePath of
c1 : c2 : c3 : c4 : path
| (WS.toChar <$> [c1, c2, c3, c4]) == "\\\\?\\" ->
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a way to leverage short byte string's isPrefixOf here?

then simplifiedPath
else
case WS.toChar <$> simplifiedPath' of
'\\' : '?' : '?' : '\\' : _ -> simplifiedPath
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a way to leverage short byte string's isPrefixOf here and below?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants