added xlsx reader #11287
aantich
commented
Nov 9, 2025
- Reads multiple spreadsheets
- Converts spreadsheet tables to pandoc tables
- Basic tests added
.gitignore
/test-docs
doc/pptx-reader-design-v2.md
doc/pptx-reader-design.md
doc/xlsx-reader-design.md
As with the other PR, let's leave off these .gitignore changes.
I think the test files may need to be included in extra-source-files in pandoc.cabal.
The new format should be added to MANUAL.txt; see the list under
One thing I noticed testing this locally: I got a table with hundreds of empty rows.
- MANUAL updated
- trailing empty rows removed
I believe we addressed all comments. I didn't test the empty-row handling rigorously, but it works on a couple of quick files.
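As an illustration of the trailing-empty-row cleanup discussed above, here is a minimal sketch. It is not the PR's code: the `Row` type, `isEmptyRow`, and `dropTrailingEmptyRows` are hypothetical names, and cells are simplified to plain strings. It only shows one way to trim empty rows from the end of a sheet while leaving interior empty rows alone.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Hypothetical simplified row type: each cell is just its text content.
-- The real reader works on a richer cell type.
type Row = [String]

-- A row counts as empty when every cell is blank or whitespace-only.
-- A row with no cells at all is also treated as empty.
isEmptyRow :: Row -> Bool
isEmptyRow = all (all isSpace)

-- Drop empty rows at the end of the sheet; interior empty rows are kept,
-- since they may be meaningful spacing in the spreadsheet.
dropTrailingEmptyRows :: [Row] -> [Row]
dropTrailingEmptyRows = dropWhileEnd isEmptyRow

main :: IO ()
main = mapM_ print (dropTrailingEmptyRows
  [ ["a", "b"], ["", ""], ["c", ""], ["", " "], [] ])
```

Running `main` prints the first three rows only. A file whose used range extends hundreds of rows past the data would be a natural regression test for this behaviour.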
cellToInlines :: XlsxCell -> [Inline]
cellToInlines cell =
  let base = case cellValue cell of
        TextValue t -> [Str t]
For text values, better to use B.toList (B.text t) where B is Text.Pandoc.Builder. This will convert the string into a list of Str and Space elements. Pandoc expects spaces to be represented as Space and not space characters inside a Str.
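The difference the reviewer describes can be seen in a small sketch. It assumes the `pandoc-types` package is available; `textToInlines` is a hypothetical helper name used only for illustration, not a function from the PR.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Text.Pandoc.Builder as B
import Text.Pandoc.Definition (Inline (..))

-- Hypothetical helper: split cell text into Str and Space inlines
-- by going through the Builder API instead of wrapping it in one Str.
textToInlines :: Text -> [Inline]
textToInlines = B.toList . B.text

main :: IO ()
main = do
  -- Wrapping the raw text keeps the space character inside the Str.
  print [Str "Total sales"]
  -- B.text breaks the text at whitespace into separate inlines.
  print (textToInlines "Total sales")
```

The second `print` shows `[Str "Total",Space,Str "sales"]`, which is the shape pandoc's writers and filters expect.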
nativeDiff :: FilePath -> Pandoc -> Pandoc -> IO (Maybe String)
nativeDiff normPath expectedNative actualNative
  | expectedNative == actualNative = return Nothing
  | otherwise = Just <$> do
      expected <- T.unpack <$> runIOorExplode (writeNative def expectedNative)
      actual <- T.unpack <$> runIOorExplode (writeNative def actualNative)
      let dash = replicate 72 '-'
      let diff = getDiff (lines actual) (lines expected)
      return $ '\n' : dash ++
               "\n--- " ++ normPath ++
               "\n+++ " ++ "test" ++ "\n" ++
               showDiff (1,1) diff ++ dash
Since this seems to be duplicated from the docx reader tests, I wonder if it makes sense to import it from there, or put it in some common place, e.g. Test.Helpers?
