Contributor

@cx-rogerio-dalot cx-rogerio-dalot commented Jul 24, 2025

Closes #

Proposed Changes

  • Removed native concurrency primitives from global variables.
  • Moved the scan pipeline from cmd/ into the engine itself, so the pipeline is easier to design and orchestrate.
  • Fixed race warnings and made the pipeline of the scan truly concurrent.
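
To illustrate the first two points, here is a minimal sketch of a pipeline whose channel and WaitGroup live on a struct instead of in package-level variables. The `Engine` type, `NewEngine`, `Scan`, and the "contains `secret`" check are hypothetical stand-ins, not the actual 2ms code.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Engine is a hypothetical stand-in for the scan engine. It owns its channel
// and WaitGroup, so nothing concurrent is shared through global variables.
type Engine struct {
	items chan string
	wg    sync.WaitGroup
}

// NewEngine creates an engine with a buffered input channel.
func NewEngine(buffer int) *Engine {
	return &Engine{items: make(chan string, buffer)}
}

// Scan pushes inputs through one detection stage run by `workers` goroutines
// and returns the matching items. The engine is single-use in this sketch
// because Scan closes the input channel.
func (e *Engine) Scan(inputs []string, workers int) []string {
	// Buffered to len(inputs) so workers never block on sending a finding.
	findings := make(chan string, len(inputs))

	for i := 0; i < workers; i++ {
		e.wg.Add(1)
		go func() {
			defer e.wg.Done()
			for item := range e.items {
				// Placeholder detection rule (assumption for illustration).
				if strings.Contains(item, "secret") {
					findings <- item
				}
			}
		}()
	}

	for _, in := range inputs {
		e.items <- in
	}
	close(e.items) // signal the workers that no more items are coming
	e.wg.Wait()    // wait for every worker to drain the channel
	close(findings)

	var out []string
	for f := range findings {
		out = append(out, f)
	}
	return out
}

func main() {
	e := NewEngine(4)
	got := e.Scan([]string{"a", "secret=1", "b", "secret=2"}, 2)
	fmt.Println("findings:", len(got))
}
```

Because every goroutine only touches state reachable from the `Engine` value, two engines can run side by side without racing on shared globals.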

Checklist

  • I covered my changes with tests.
  • I updated the documentation that is affected by my changes:
    • Change in the CLI arguments
    • Change in the configuration file

I submit this contribution under the Apache-2.0 license.

kaplanlior and others added 30 commits March 29, 2023 15:16
* fix: Improve reporting service for different outputs

* refactor: applied code review suggestions

* refactor: applied code review suggestions

* refactor: applied code review suggestions

* Update cmd/main.go

* ci: windows support fix

* ci: revert fix attempt

---------

Co-authored-by: Jossef Harush Kadouri <[email protected]>
* Refactor Confluence Plugin

* Apply PR suggestions
* ignore .vscode

* feat(confluence): filter spaces

* fix(confluence): use username and token

* chore: move httpRequest to lib
This PR is replacing #46.

@baruchiro made a PR with a better mechanism to search for only
specified spaces.

In this PR I'm just fixing a small issue with that implementation:
changing the --confluence-spaces argument from string to
stringArray.

This fixes an issue we hit when no spaces were selected.
closes #34

- create docker image
- create makefile
- validate docker on pr-validation
- upload tar image to Github release
- run Kics on pr-validation
- fix Kics issues
Implement suggested concurrency logic 
 
 Also added confluence window constant to avoid duplicated requests

Added a time.Sleep since without that the last item wasn't being added
to the list of results
I accidentally used `docker save` instead of `make save` on #48.
Resolves #31

### Features

- We expect the user to give us a *Personal Access Token*. This token
can be retrieved from the browser *Dev Tool*, or by authenticating a
*Discord App*. See
#31 (comment).
- The user must give at least one `--discord-server`, we will not loop
over all the user's servers.
- The *Server* in Discord is called a *Guild* in the API.
- If the user doesn't give `--discord-channel`, we will scan all the
channels in a server.
- We will scan all messages until `--discord-duration` or
`--discord-messages-count` is reached (whichever comes first).
- **Threads**: Only *bots* can get all the **threads** from a
**channel**. As we currently use a *Personal Access Token*, we can only get
the **thread** from the **message** that started it.
So we will scan the **threads** that *started* within the time/limit
arguments (as they are messages that were scanned), and for each **thread**
we will scan the messages within the time/limit requirements.

### Questions

- It is a little confusing if you give multiple **servers** and multiple
**channels**, because each **channel** belongs to one **server**; and
if you give **channels** for only one **server**, the other
**servers** will not be scanned at all.
Compared to *Confluence*, which has **servers** and **spaces**, we are
not scanning multiple **servers** in one scan.

Waits for #52
I noticed in `pr-validation` we are running our linter twice, and not
running unit-tests at all.
BREAKING CHANGE: the CLI command and arguments were changed

See the discussion:
#20 (comment)

I think we don't need to support running *2ms* for multiple plugins at
once. It is a rare case and it makes the command-line arguments confusing.

Instead, I'm proposing using *SubCommand* for each plugin.

---------

Co-authored-by: Jossef Harush Kadouri <[email protected]>
- make functions private
- add helpful examples for -h/--help/help

Close #43

```
❯ go run .
2ms Secrets Detection: A tool to detect secrets in public websites and communication services.

Usage:
  2ms [command]

Plugins
  confluence  Scan Confluence server
  discord     Scan Discord server
  repository  Scan local repository

Additional Commands:
  completion  Generate the autocompletion script for the specified shell
  help        Help about any command

Flags:
  -h, --help               help for 2ms
      --log-level string   log level (trace, debug, info, warn, error, fatal) (default "info")
      --tags strings       select rules to be applied (default [all])
  -v, --version            version for 2ms

Use "2ms [command] --help" for more information about a command.
```

```
❯ go run . help confluence
Scan Confluence server for sensitive information

Usage:
  2ms confluence --url URL [flags]

Flags:
  -h, --help                 help for confluence
      --history              Scan pages history
      --spaces stringArray   Confluence spaces: The names or IDs of the spaces to scan
      --token string         The Confluence API token for authentication
      --url string           Confluence server URL (example: https://company.atlassian.net/wiki) [required]
      --username string      Confluence user name or email for authentication

Global Flags:
      --log-level string   log level (trace, debug, info, warn, error, fatal) (default "info")
      --tags strings       select rules to be applied (default [all])
```

```
❯ go run . help discord
Scan Discord server for sensitive information.

Usage:
  2ms discord --token TOKEN --server SERVER [flags]

Flags:
      --channel stringArray   Discord channels IDs to scan. If not provided, all channels will be scanned
      --duration duration     The time interval to scan from the current time. For example, 24h for 24 hours or 7d for 7 days. (default 336h0m0s)
  -h, --help                  help for discord
      --messages-count int    The number of messages to scan. If not provided, all messages will be scanned until the fromDate flag value.
      --server stringArray    Discord servers IDs to scan [required]
      --token string          Discord token [required]

Global Flags:
      --log-level string   log level (trace, debug, info, warn, error, fatal) (default "info")
      --tags strings       select rules to be applied (default [all])
```

```
❯ go run . help repository
Scan local repository for sensitive information

Usage:
  2ms repository --path PATH [flags]

Flags:
  -h, --help          help for repository
      --path string   Local repository path [required]

Global Flags:
      --log-level string   log level (trace, debug, info, warn, error, fatal) (default "info")
      --tags strings       select rules to be applied (default [all])
```

---------

Co-authored-by: Jossef Harush Kadouri <[email protected]>
Close: #32

```
❯ go run . help slack
Scan Slack team for sensitive information.

Usage:
  2ms slack --token TOKEN --team TEAM [flags]

Flags:
      --channel stringArray   Slack channels to scan
      --duration duration     Slack backward duration for messages (ex: 24h, 7d, 1M, 1y) (default 336h0m0s)
  -h, --help                  help for slack
      --messages-count int    Slack messages count to scan (0 = all messages)
      --team string           Slack team name or ID [required]
      --token string          Slack token [required]

Global Flags:
      --all                scan all plugins (default true)
      --log-level string   log level (trace, debug, info, warn, error, fatal) (default "info")
      --tags strings       select rules to be applied (default [all])
```

Like in Discord, more knowledge is required to integrate this plugin
into an E2E system. For example, in addition to retrieving the token (from
**OAuth & Permissions** in the Slack App page), you have to add your app
to each **channel** you want to read.


![image](https://github.com/Checkmarx/2ms/assets/17686879/0db4994a-7304-4e70-a268-fff242f6ca35)

---------

Co-authored-by: Jossef Harush Kadouri <[email protected]>
- split into 2 dependent jobs
- publish to Dockerhub

For #77 (not closing until testing it)
- get folders and components
- handle only `component` and `folder` types
- send content to items
- avoid reaching the rate limit

Note we are handling only `folder`s to find sub-folders or `components`,
and reading only the `component` (document) content.

### Rate Limit

The rate limit is defined **per minute**, as described
[here](https://paligo.net/docs/apidocs/en/index-en.html#UUID-a5b548af-9a37-d305-f5a8-11142d86fe20).
The `golang.org/x/time/rate` package expresses its limit **per second**. That's why
I'm starting with open seats for all the requests-per-minute, in case
there are fewer requests than the limit. After the first
too-many-requests error, I'm reducing the available seats to 1, so the
frequency of the requests will not reach the rate limit.

Close #75
All IDs are now usable for the user
That's because the Docker publish step is still in progress, but I don't
want it to block us from creating a new release.
In Jenkins, I saw users storing their credentials as a `username:token`
base64-encoded secret, so I want to support it as an argument.
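
For illustration, a minimal sketch of decoding such a secret with the standard library. The `decodeCredentials` helper is a hypothetical name, and standard (padded) base64 encoding is an assumption.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// decodeCredentials turns a base64-encoded "username:token" string into its
// two parts. Standard base64 with padding is assumed.
func decodeCredentials(encoded string) (string, string, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", "", fmt.Errorf("decode credentials: %w", err)
	}
	// Split on the first colon only, so the token itself may contain colons.
	parts := strings.SplitN(string(raw), ":", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", errors.New("credentials must be base64 of username:token")
	}
	return parts[0], parts[1], nil
}

func main() {
	encoded := base64.StdEncoding.EncodeToString([]byte("user:token"))
	username, token, err := decodeCredentials(encoded)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(username, token)
}
```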
- refactor: repository plugin initialization
- change repository to filesystem
- repository plugin - support scanning historical git commits
  Fixes #66
@cx-rogerio-dalot cx-rogerio-dalot force-pushed the AST-99998-architecture-refactor-2 branch from 1fcb373 to 03320ed Compare September 2, 2025 11:29
engine/engine.go Outdated

ScanConfig: engineConfig.ScanConfig,

secretsChan: make(chan *secrets.Secret, 10),
Contributor

@cx-leonardo-fontes cx-leonardo-fontes Sep 4, 2025

Is there any reason for this to be specifically 10? Will all the channels here have the same buffer size? If so, what about a const for this size?

Contributor Author

At the time, I just had to put something. I changed all these to runtime.GOMAXPROCS which is roughly the amount of concurrency we have.
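
A minimal sketch of what that change might look like, with one shared helper instead of scattered literals. `defaultBufferSize`, `secret`, `channels`, and `newChannels` are hypothetical names for illustration and not the actual engine code.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultBufferSize derives the channel capacity from the number of OS
// threads that may run Go code at once. GOMAXPROCS(0) only reads the current
// setting; it does not change it.
func defaultBufferSize() int {
	return runtime.GOMAXPROCS(0)
}

// secret is a hypothetical stand-in for the real secrets type.
type secret struct {
	Value string
}

// channels groups the pipeline channels so they all share one sizing rule.
type channels struct {
	secrets chan *secret
	errors  chan error
}

func newChannels() *channels {
	size := defaultBufferSize()
	return &channels{
		secrets: make(chan *secret, size),
		errors:  make(chan error, size),
	}
}

func main() {
	c := newChannels()
	fmt.Println("buffer size:", cap(c.secrets))
}
```

Keeping the size in one helper also answers the const question: the value is computed at runtime, so it cannot be a Go constant, but it still has a single definition.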


func NewChannels(opts ...Option) PluginChannels {
channels := &Channels{
Items: make(chan ISourceItem, 64),
Contributor

Same question I asked in the other comment: is there any reason for the channel buffer size to be exactly this value? The same goes for the error channel.

Contributor Author

see my answer above


Closes #

**Proposed Changes**

- Introduces a worker pool that wraps
[pond](https://github.com/alitto/pond) to optimize detection.
- The pool initializes with a default number of workers equal to the number of
CPUs x2.
- I reached that number through benchmarking; see my comment below with
the stats.
- This worker pool can be used for other purposes in later steps of the
pipeline (in future PRs).
- In a future PR we won't wait for the detection to end.
- Fixed some linter issues
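
As a rough illustration of the sizing rule above, here is a worker pool built only on the standard library. It is a swapped-in stand-in and does not use pond's API; `pool`, `newPool`, `Submit`, and `StopAndWait` are hypothetical names.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// pool is a minimal fixed-size worker pool.
type pool struct {
	tasks chan func()
	wg    sync.WaitGroup
}

// newPool starts the workers. A non-positive count falls back to the default
// of twice the number of CPUs.
func newPool(workers int) *pool {
	if workers <= 0 {
		workers = runtime.NumCPU() * 2
	}
	p := &pool{tasks: make(chan func())}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for task := range p.tasks {
				task()
			}
		}()
	}
	return p
}

// Submit hands a task to the next free worker, blocking until one is free.
func (p *pool) Submit(task func()) {
	p.tasks <- task
}

// StopAndWait stops accepting tasks and waits for running ones to finish.
func (p *pool) StopAndWait() {
	close(p.tasks)
	p.wg.Wait()
}

func main() {
	p := newPool(0)
	done := make(chan int, 100) // buffered so tasks never block on reporting
	for i := 0; i < 100; i++ {
		p.Submit(func() { done <- 1 })
	}
	p.StopAndWait()
	fmt.Println("tasks completed:", len(done))
}
```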

**Checklist**

- [X] I covered my changes with tests.
- [ ] I updated the documentation that is affected by my changes:
  - [ ] Change in the CLI arguments
  - [ ] Change in the configuration file

I submit this contribution under the Apache-2.0 license.

---------

Co-authored-by: Rogério Dalot <[email protected]>
Base automatically changed from AST-99998-architecture-refactor to master September 9, 2025 09:57
@cx-rogerio-dalot cx-rogerio-dalot dismissed cx-leonardo-fontes’s stale review September 9, 2025 09:57

The base branch was changed.

github-actions bot commented Sep 9, 2025

KICS version: v1.7.13

| Category | Results |
| --- | --- |
| HIGH | 0 |
| MEDIUM | 0 |
| LOW | 0 |
| INFO | 0 |
| TRACE | 0 |
| TOTAL | 0 |

| Metric | Values |
| --- | --- |
| Files scanned | 13 |
| Files parsed | 13 |
| Files failed to scan | 0 |
| Total executed queries | 53 |
| Queries failed to execute | 0 |
| Execution time | 1 |

@cx-rogerio-dalot cx-rogerio-dalot force-pushed the AST-99998-architecture-refactor-2 branch 2 times, most recently from 568e87e to 321e9ce Compare September 11, 2025 16:06
@cx-rogerio-dalot
Contributor Author

I messed up the history while trying to fix the commit history because of .2ms.yml, so this PR has moved to #327.
