-
Notifications
You must be signed in to change notification settings - Fork 0
Splitting a list of numbers into two parts that total the same in 20 languages.
License
bicknell/split-sum
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
History:
The author had an interview with a coding component. This was
one of the problems. The author had a choice of languages, and
began to ponder which was best.
At the same time, the author was curious what ChatGPT-4 could
do in terms of generating code for other languages. The author
generated the first code by hand, and used ChatGPT to do initial
translations to other languages. About 1/3 of the ChatGPT
versions worked as-is, while 2/3rds took some minor fixing and
debugging. All ChatGPT versions were looked over manually for
optimizations. In at least two cases it chose very inefficient
methods that while working greatly slowed execution, these were
manually fixed.
The author is not fluent in all of these languages. The author
accepts patches that improve the programs using standard methods
(e.g. no "special tricks").
Problem statement:
Given a list of integers determine if the list can be split at
any point so that the sum of the items on both sides of the
split are the same. Return the left and right parts as new
lists.
Test cases:
[] -> [[], []]
[100] -> [[], []]
[99, 99] -> [[99], [99]]
[98, 1, 99] -> [[98, 1], [99]]
[99, 1, 98] -> [[99], [1, 98]]
[1, 2, 3, 0] -> [[1, 2], [3, 0]]
[1, 2, 3, 5] -> [[], []]
[1, 2, 2, 1, 0] -> [[1, 2], [2, 1, 0]]
[10, 11, 12, 16, 17] -> [[10, 11, 12], [16, 17]]
[1, 1, 1, 1, 1, 1, 6] -> [[1, 1, 1, 1, 1, 1], [6]]
[6, 1, 1, 1, 1, 1, 1] -> [[6], [1, 1, 1, 1, 1, 1]]
Note: The return [[], []] may be null/nill/error in some languages.
Software Needed:
On an OSX Sonoma (14.1 or greater) system:
- Pre-installed: Python 3, Perl 5, Ruby, TCL, ZSH, Bash
- Install Xcode for: C, C++, Objective-C, Swift
- Install Rust with rustup:
- curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
- Install Node.JS using installer at https://nodejs.org/en/download
- Install Go using package at https://go.dev/doc/install
- Install Java JDK using DMG at https://www.oracle.com/java/technologies/downloads/
- Use HomeBrew at https://brew.sh for:
- brew install clisp
- brew install php
- brew install postgressql
- brew install r
- brew install erlang
Note, PostgreSQL must be started and stopped as needed:
- brew services start postgresql@14
- brew services stop postgresql@14
How to run:
'make' will execute all compilation steps (same as 'make all').
'make runall' will build (if necessary) and then execute tests in all languages.
Language specific notes:
ZSH and BASH do not support functions returning values or pass
by reference. The values are printed to the screen and captured
which is standard practice.
Java requires the file name and class name to match, and the
class name to not include a '-' character, resulting in the
file being called splitSum.java.
Rust needs to be build with cargo to bring in the lazy_static
dependency. Cargo requires a particular directory structure,
so the rust source code is in src/main.rs, the executable is
output to target/release/split-sum and the Cargo.toml file is
in the main directory.
C++, Objective-C, and Swift can all basically use the same
implementation as C, but instead their OOP data structures
were used to demonstrate the differences in performance.
TCL global variables must be before any procedures, so the
cases variable has been moved to the top of the file.
ZSH and BASH are both slow enough that 100,000 runs takes too
long. The code has been modified to do 10,000 runs, and
multiply the result by 10.
Arrays:
Indexing:
Fortran R, Erlang, and ZSH both start arrays at 1 and index
from 1 to <length>. All other languages start at 0 and index
from 0 to <length - 1>.
In Fortran is it possible to start arrays at any index, a
feature not used in this code.
Fortran arrays are also "column major" unlike other
languages. The 'reshape' function is used to initialize
cases so that the text in the code continues to be row-major
like other languages.
Slices
C has no slice function, elements are copied with memcpy.
Erlang no slice function, but it does have a list:split().
Go, Rust, Python 3, PHP, C++, Objective-C, LISP, Java and
JavaScript all return a list exclusive of the end position.
E.g. Slicing [ 1 2 3 4 5 ] with a start of 1 and an end
of 3 results in [ 2 3 ].
TCL, Ruby, Swift, Perl 5, R, PostgreSQL, BASH, all return
a list inclusive of the end position. E.g. Slicing
[ 1 2 3 4 5 ] with a start of 1 and an end of 3 results
in [ 2 3 4 ].
Fortran, R, and ZSH all return a list inclusive of the end
positio, but also start arrays at 1!!!! E.g. Slicing
[ 1 2 3 4 5 ] with a start of 1 and an end of 3 results in
[ 1 2 3 ].
LOC:
Lines of Code is not a great measure of any programming language.
It is also a measure that can be easily manipulated in many
languages, for instance by using naked if statements insted of
enclosing in braces. Each of these attempted to write the code
in the style most common in that language, more or less.
Lines Filename Lanugage
58 split-sum.py Python 3
59 split-sum.lsp LISP
67 split_sum.erl Erlang
68 split-sum-create.sql SQL (Two files together.)
split-sum-query.sql
69 split-sum.js JavaScript
70 split-sum.pl Perl 5
72 split-sum.rb Ruby
73 split-sum.php PHP
78 split-sum.tcl TCL
81 split-sum.swift Swift
82 split-sum.go Go
82 splitSum.java Java
83 split-sum.zsh ZSH
86 split-sum.sh SH
88 src/main.rs Rust
93 split-sum.m Objective-C
100 split-sum.cpp C++
111 split-sum.c C
136 split-sum.f90 Fortran
Results:
The runall target runs 100,000 (10,000 for BASH and ZSH, which is
then multiplied by 10 for consistency) iterations of the testCases
function and outputs the time in seconds taken for all runs.
Times Array Slice x:y returns
Rank Seconds Language Slower Start entries from Comment
1 0.072 C 1.0 0 No slice function. Built in arrays of int.
2 0.114 Fortran 1.5 1 [x] ... [y] Built in arrays of integer.
3 0.318 Java 4.4 0 [x] ... [y-1] Built in arrays, of int.
4 0.415 Go 5.7 0 [x] ... [y-1] Built in arrays i32.
5 0.445 Rust 6.1 0 [x] ... [y-1] Box of Box of i32.
6 0.605 JavaScript 8.4 0 [x] ... [y-1] Built in arrays, of int.
7 1.538 Erlang 21.3 1 list:split() Built in lists, of int.
8 1.917 TCL 26.6 0 [x] ... [y] Built in arrays untyped.
9 1.952 C++ 27.1 0 [x] ... [y-1] vector of vector of int.
10 2.643 PHP 36.7 0 [x] ... [y-1] Built in arrays, untyped.
11 2.910 Swift 40.4 0 [x] ... [y] Built in arrays of int.
12 4.290 Objective-C 59.5 0 [x] ... [y-1] NSArray of NSArray of NSNumber.
13 4.907 Ruby 68.1 0 [x] ... [y] Built in arrays, untyped.
14 7.361 Python 3 102.2 0 [x] ... [y-1] Built in arrays, of int.
15 17.650 Perl 5 245.1 0 [x] ... [y] Built in arrays, untyped.
16 21.847 R 303.4 1 [x] ... [y] Built in lists, untyped.
17 33.493 PostgreSQL 465.1 0 [x] ... [y] Built in arrays, of int.
18 46.861 LISP 650.8 0 [x] ... [y-1] Built in lists, untyped.
19 860.000 ZSH 11944 1 [x] ... [y] Built in arrays, untyped.
20 2350.000 BASH 32638 0 [x] ... [y] Built in arrays, untyped.
The file benchmark-run.txt contains the run used to make this table.
Bugs/Known Limitations:
The code here is more for language comparison than to be fully optimized and
hardended. As a result the following shortcomings are known.
C, Go, Rust, C++, and Swift all use "integer" arrays, typically 32 bit integers
on most platforms. The total values in the splitSum function are also the same
data type, leading to an overflow possibility.
C, C++, Objective-C and Swift all pass by reference (pointers) to the splitSum
function, but do not check if these pointers are NULL/NIL before using them.
In all languages it is assumed the dataset can fit into memory.
It would be nice if the test cases were in a file and all programs read the same
input.
About
Splitting a list of numbers into two parts that total the same in 20 languages.