Commit d80f66f

Merge branch 'doc-fixes' of https://github.com/graphistry/pygraphistry into cleanup
2 parents a272ea3 + 0b4702a commit d80f66f

15 files changed: +410 -230 lines changed

docs/source/graphistry.rst

Lines changed: 50 additions & 11 deletions
@@ -1,42 +1,81 @@
-graphistry package
+Layout & Plugins
 ==================
 .. toctree::
    :maxdepth: 3

-   graphistry.compute
+
    graphistry.layout
    graphistry.plugins
    graphistry.plugins_types


-graphistry.plotter module
--------------------------
+Plotter Module
+==================

-.. automodule:: graphistry.plotter
+.. automodule:: graphistry.PlotterBase
    :members:
    :undoc-members:
    :show-inheritance:

-graphistry.pygraphistry module
-------------------------------
+Pygraphistry Module
+==================

 .. automodule:: graphistry.pygraphistry
    :members:
    :undoc-members:
    :show-inheritance:

-graphistry.arrow_uploader module
---------------------------------
+Featurize
+==================
+.. automodule:: graphistry.feature_utils
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+
+UMAP
+==================
+.. automodule:: graphistry.umap_utils
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+
+Semantic Search
+==================
+.. automodule:: graphistry.text_utils
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+DBScan
+==================
+.. automodule:: graphistry.compute.cluster
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Arrow uploader Module
+==================

 .. automodule:: graphistry.arrow_uploader
    :members:
    :undoc-members:
    :show-inheritance:

-graphistry.ArrowFileUploader module
------------------------------------
+Arrow File Uploader Module
+==================

 .. automodule:: graphistry.ArrowFileUploader
    :members:
    :undoc-members:
    :show-inheritance:
+
+Versioneer
+==================
+
+.. automodule:: graphistry._version
+   :members:
+   :undoc-members:
+   :show-inheritance:
+

docs/source/index.rst

Lines changed: 6 additions & 3 deletions
@@ -1,8 +1,11 @@
-PyGraphistry's documentation (|version|)
+PyGraphistry[ai]'s documentation
 ========================================

-Quickstart:
-`Read our tutorial <https://github.com/graphistry/pygraphistry/blob/master/README.md>`_
+.. Quickstart:
+.. `Read our tutorial <https://github.com/graphistry/pygraphistry/blob/master/README.md>`_
+
+PyGraphistry is a Python visual graph AI library to extract, transform, analyze, model, and visualize big graphs, and especially alongside Graphistry end-to-end GPU server sessions. Installing optional graphistry[ai] dependencies adds graph autoML, including automatic feature engineering, UMAP, and graph neural net support. Combined, PyGraphistry reduces your time to graph for going from raw data to visualizations and AI models down to three lines of code.
+Here in our docstrings you can find useful packages, modules, and commands to maximize your graph AI experience with PyGraphistry. In the navbar you can find an overview of all the packages and modules we provided and a few useful highlighted ones as well. You can search for them on our Search page. For a full tutorial, refer to our `PyGraphistry <https://github.com/graphistry/pygraphistry/>`_ repo.

 .. toctree::
    :maxdepth: 3

docs/source/modules.rst

Lines changed: 6 additions & 6 deletions
@@ -1,9 +1,9 @@
-doc
-===
+.. doc
+.. ===

-.. toctree::
-   :maxdepth: 4
-   :caption: Contents:
+.. .. toctree::
+.. :maxdepth: 4
+.. :caption: Contents:

-   versioneer
+.. versioneer


docs/source/versioneer.rst

Lines changed: 2 additions & 2 deletions
@@ -1,2 +1,2 @@
-versioneer module
-=================
+.. versioneer module
+.. =================

graphistry/PlotterBase.py

Lines changed: 10 additions & 4 deletions
@@ -300,7 +300,7 @@ def style(self, fg=None, bg=None, page=None, logo=None):
         :param fg: Dictionary {'blendMode': str} of any valid CSS blend mode
         :type fg: dict

-        :param bg: Nested dictionary of page background properties. {'color': str, 'gradient': {'kind': str, 'position': str, 'stops': list }, 'image': { 'url': str, 'width': int, 'height': int, 'blendMode': str }
+        :param bg: Nested dictionary of page background properties. { 'color': str, 'gradient': {'kind': str, 'position': str, 'stops': list }, 'image': { 'url': str, 'width': int, 'height': int, 'blendMode': str }
         :type bg: dict

         :param logo: Nested dictionary of logo properties. { 'url': str, 'autoInvert': bool, 'position': str, 'dimensions': { 'maxWidth': int, 'maxHeight': int }, 'crop': { 'top': int, 'left': int, 'bottom': int, 'right': int }, 'padding': { 'top': int, 'left': int, 'bottom': int, 'right': int}, 'style': str}
@@ -314,15 +314,18 @@ def style(self, fg=None, bg=None, page=None, logo=None):

         **Example: Chained merge - results in url and blendMode being set, while color is dropped**
         ::
+
             g2 = g.style(bg={'color': 'black'}, fg={'blendMode': 'screen'})
             g3 = g2.style(bg={'image': {'url': 'http://site.com/watermark.png'}})

         **Example: Gradient background**
         ::
+
             g.style(bg={'gradient': {'kind': 'linear', 'position': 45, 'stops': [['rgb(0,0,0)', '0%'], ['rgb(255,255,255)', '100%']]}})

         **Example: Page settings**
         ::
+
             g.style(page={'title': 'Site - {{ name }}', 'favicon': 'http://site.com/logo.ico'})

         """
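The "Chained merge" example in the docstring above can be modelled with a small standalone sketch. It assumes that each `style()` call replaces any top-level key it is given (`fg`, `bg`, `page`, `logo`) wholesale and keeps the keys it is not given. The helper `merge_style` is hypothetical and is not PyGraphistry's implementation; it only reproduces the outcome the docstring describes (url and blendMode set, color dropped).

```python
from typing import Any, Dict, Optional

Style = Dict[str, Any]


def merge_style(prev: Style,
                fg: Optional[dict] = None,
                bg: Optional[dict] = None,
                page: Optional[dict] = None,
                logo: Optional[dict] = None) -> Style:
    """Hypothetical model of one chained style() call.

    Assumption: a supplied top-level key replaces the earlier value
    wholesale, and an omitted key keeps its earlier value.
    """
    updates = {'fg': fg, 'bg': bg, 'page': page, 'logo': logo}
    merged = dict(prev)  # shallow copy, so the earlier style is left untouched
    merged.update({k: v for k, v in updates.items() if v is not None})
    return merged


# Mirrors g2 = g.style(...) and g3 = g2.style(...) from the docstring example.
s2 = merge_style({}, bg={'color': 'black'}, fg={'blendMode': 'screen'})
s3 = merge_style(s2, bg={'image': {'url': 'http://site.com/watermark.png'}})
```

Under this assumption `s3` keeps `fg` from the first call, while the second `bg` replaces the first, which is why `color` no longer appears.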
@@ -850,13 +853,14 @@ def bind(self, source=None, destination=None, node=None, edge=None,
         :param edge: Attribute containing an edge's ID
         :type edge: str

-        :param edge_title: Attribute overriding edge's minimized label text. By default, the edge source and destination is used.
+        :param edge_title: Attribute overriding edge's minimized label text.
+            By default, the edge source and destination is used.
         :type edge_title: str

         :param edge_label: Attribute overriding edge's expanded label text. By default, scrollable list of attribute/value mappings.
         :type edge_label: str

-        :param edge_color: Attribute overriding edge's color. rgba (int64) or int32 palette index, see palette definitions <https://graphistry.github.io/docs/legacy/api/0.9.2/api.html#extendedpalette>`_ for values. Based on Color Brewer.
+        :param edge_color: Attribute overriding edge's color. rgba (int64) or int32 palette index, see `palette <https://graphistry.github.io/docs/legacy/api/0.9.2/api.html#extendedpalette>`_ definitions for values. Based on Color Brewer.
         :type edge_color: str

         :param edge_source_color: Attribute overriding edge's source color if no edge_color, as an rgba int64 value.
@@ -874,7 +878,7 @@ def bind(self, source=None, destination=None, node=None, edge=None,
         :param point_label: Attribute overriding node's expanded label text. By default, scrollable list of attribute/value mappings.
         :type point_label: str

-        :param point_color: Attribute overriding node's color.rgba (int64) or int32 palette index, see palette definitions <https://graphistry.github.io/docs/legacy/api/0.9.2/api.html#extendedpalette>`_ for values. Based on Color Brewer.
+        :param point_color: Attribute overriding node's color.rgba (int64) or int32 palette index, see `palette <https://graphistry.github.io/docs/legacy/api/0.9.2/api.html#extendedpalette>`_ definitions for values. Based on Color Brewer.
         :type point_color: str

         :param point_size: Attribute overriding node's size. By default, uses the node degree. The visualization will normalize point sizes and adjust dynamically using semantic zoom.
@@ -1007,6 +1011,7 @@ def nodes(self, nodes: Union[Callable, Any], node=None, *args, **kwargs) -> Plot

         **Example**
         ::
+
             import graphistry

             def sample_nodes(g, n):
@@ -1106,6 +1111,7 @@ def edges(self, edges: Union[Callable, Any], source=None, destination=None, edge

         **Example**
         ::
+
             import graphistry

             def sample_edges(g, n):
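Several hunks in this file add a blank line after a bare `::` marker, because reStructuredText expects a blank line between the marker and the indented literal block. A minimal sketch of a check for that pattern follows. The helper name `literal_markers_missing_blank` is hypothetical and is not part of PyGraphistry or Sphinx, and the check is a simple line scan rather than a full reST parse.

```python
from typing import List


def literal_markers_missing_blank(docstring: str) -> List[int]:
    """Return 1-based line numbers of `::` markers not followed by a blank line."""
    lines = docstring.splitlines()
    flagged = []
    for i, line in enumerate(lines):
        text = line.strip()
        # Skip explicit markup such as ".. toctree::", which may be
        # followed directly by option lines.
        if not text.endswith("::") or text.startswith(".."):
            continue
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        if next_line.strip():
            flagged.append(i + 1)
    return flagged


# Shapes of the docstring before and after this commit's change.
before = "**Example**\n::\n    import graphistry\n"
after = "**Example**\n::\n\n    import graphistry\n"
```

Running the helper on `before` flags the marker line, and running it on `after` flags nothing.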

graphistry/compute/cluster.py

Lines changed: 35 additions & 25 deletions
@@ -71,11 +71,11 @@ def get_model_matrix(g, kind: str, cols: Optional[Union[List, str]], umap, targe
     Allows for a single function to get the model matrix for both nodes and edges as well as targets, embeddings, and features

     Args:
-        g: graphistry graph
-        kind: 'nodes' or 'edges'
-        cols: list of columns to use for clustering given `g.featurize` has been run
-        umap: whether to use UMAP embeddings or features dataframe
-        target: whether to use the target dataframe or features dataframe
+        :g: graphistry graph
+        :kind: 'nodes' or 'edges'
+        :cols: list of columns to use for clustering given `g.featurize` has been run
+        :umap: whether to use UMAP embeddings or features dataframe
+        :target: whether to use the target dataframe or features dataframe

     Returns:
         pd.DataFrame: dataframe of model matrix given the inputs
@@ -99,11 +99,11 @@ def dbscan_fit(g: Any, dbscan: Any, kind: str = "nodes", cols: Optional[Union[Li
     Fits clustering on UMAP embeddings if umap is True, otherwise on the features dataframe
         or target dataframe if target is True.

-    args:
-        g: graphistry graph
-        kind: 'nodes' or 'edges'
-        cols: list of columns to use for clustering given `g.featurize` has been run
-        use_umap_embedding: whether to use UMAP embeddings or features dataframe for clustering (default: True)
+    Args:
+        :g: graphistry graph
+        :kind: 'nodes' or 'edges'
+        :cols: list of columns to use for clustering given `g.featurize` has been run
+        :use_umap_embedding: whether to use UMAP embeddings or features dataframe for clustering (default: True)
     """
     X = get_model_matrix(g, kind, cols, use_umap_embedding, target)

@@ -212,6 +212,8 @@ def dbscan(
         """DBSCAN clustering on cpu or gpu infered automatically. Adds a `_dbscan` column to nodes or edges.

         Examples:
+        ::
+
             g = graphistry.edges(edf, 'src', 'dst').nodes(ndf, 'node')

             # cluster by UMAP embeddings
@@ -244,14 +246,14 @@ def dbscan(
             https://github.com/graphistry/pygraphistry/blob/master/demos/ai/cyber/cyber-redteam-umap-demo.ipynb

         Args:
-            min_dist float: The maximum distance between two samples for them to be considered as in the same neighborhood.
-            kind str: 'nodes' or 'edges'
-            cols: list of columns to use for clustering given `g.featurize` has been run, nice way to slice features or targets by
+            :min_dist float: The maximum distance between two samples for them to be considered as in the same neighborhood.
+            :kind str: 'nodes' or 'edges'
+            :cols: list of columns to use for clustering given `g.featurize` has been run, nice way to slice features or targets by
                 fragments of interest, e.g. ['ip_172', 'location', 'ssh', 'warnings']
-            fit_umap_embedding bool: whether to use UMAP embeddings or features dataframe to cluster DBSCAN
-            min_samples: The number of samples in a neighborhood for a point to be considered as a core point.
+            :fit_umap_embedding bool: whether to use UMAP embeddings or features dataframe to cluster DBSCAN
+            :min_samples: The number of samples in a neighborhood for a point to be considered as a core point.
                 This includes the point itself.
-            target: whether to use the target column as the clustering feature
+            :target: whether to use the target column as the clustering feature

         """

@@ -333,43 +335,51 @@ def transform_dbscan(
         Graph nodes | edges will be colored by '_dbscan' column.

         Examples:
+        ::
+
             fit:
                 g = graphistry.edges(edf, 'src', 'dst').nodes(ndf, 'node')
                 g2 = g.featurize().dbscan()

             predict:
+            ::
+
                 emb, X, _, ndf = g2.transform_dbscan(ndf, return_graph=False)
                 # or
                 g3 = g2.transform_dbscan(ndf, return_graph=True)
                 g3.plot()

         likewise for umap:
+        ::
+
             fit:
                 g = graphistry.edges(edf, 'src', 'dst').nodes(ndf, 'node')
                 g2 = g.umap(X=.., y=..).dbscan()

             predict:
+            ::
+
                 emb, X, y, ndf = g2.transform_dbscan(ndf, ndf, return_graph=False)
                 # or
                 g3 = g2.transform_dbscan(ndf, ndf, return_graph=True)
                 g3.plot()


-        args:
-            df: dataframe to transform
-            y: optional labels dataframe
-            min_dist: The maximum distance between two samples for them to be considered as in the same neighborhood.
+        Args:
+            :df: dataframe to transform
+            :y: optional labels dataframe
+            :min_dist: The maximum distance between two samples for them to be considered as in the same neighborhood.
                 smaller values will result in less edges between the minibatch and the original graph.
                 Default 'auto', infers min_dist from the mean distance and std of new points to the original graph
-            fit_umap_embedding: whether to use UMAP embeddings or features dataframe when inferring edges between
+            :fit_umap_embedding: whether to use UMAP embeddings or features dataframe when inferring edges between
                 the minibatch and the original graph. Default False, uses the features dataframe
-            sample: number of samples to use when inferring edges between the minibatch and the original graph,
+            :sample: number of samples to use when inferring edges between the minibatch and the original graph,
                 if None, will only use closest point to the minibatch. If greater than 0, will sample the closest `sample` points
                 in existing graph to pull in more edges. Default None
-            kind: 'nodes' or 'edges'
-            return_graph: whether to return a graph or the (emb, X, y, minibatch df enriched with DBSCAN labels), default True
+            :kind: 'nodes' or 'edges'
+            :return_graph: whether to return a graph or the (emb, X, y, minibatch df enriched with DBSCAN labels), default True
                 infered graph supports kind='nodes' only.
-            verbose: whether to print out progress, default False
+            :verbose: whether to print out progress, default False

         """
         emb, X, y, df = self._transform_dbscan(df, y, kind=kind, verbose=verbose)
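The `Args:` hunks in this file all apply the same mechanical rewrite: the header is normalised to `Args:` and each `name: description` entry becomes `:name: description`, while continuation lines and other sections stay as they were. A minimal sketch of that transformation follows. The function `to_field_list` and the pattern `ARG_LINE` are hypothetical illustrations, not tooling used by this commit, and the pattern can misfire on a continuation line that happens to look like `word: text`.

```python
import re

# An argument entry: indentation, a name with an optional one-word type
# (e.g. "min_dist float"), then ": " and the description.
ARG_LINE = re.compile(r'^(\s*)([A-Za-z_]\w*(?: \w+)?): (.+)$')


def to_field_list(docstring: str) -> str:
    """Rewrite `name: desc` entries under Args: as `:name: desc` (sketch only)."""
    out, in_args = [], False
    for line in docstring.splitlines():
        stripped = line.strip()
        if stripped.lower() == 'args:':
            in_args = True
            out.append(line.replace(stripped, 'Args:'))  # normalise "args:"
            continue
        if in_args and re.fullmatch(r'\w+:', stripped):
            in_args = False  # a new section header such as "Returns:"
        m = ARG_LINE.match(line) if in_args else None
        out.append(f'{m.group(1)}:{m.group(2)}: {m.group(3)}' if m else line)
    return '\n'.join(out)


sample = (
    "args:\n"
    "    g: graphistry graph\n"
    "    min_dist float: The maximum distance\n"
    "        This includes the point itself.\n"
    "Returns:\n"
    "    pd.DataFrame: dataframe of model matrix"
)
converted = to_field_list(sample)
```

In `converted`, the two argument entries gain a leading colon, and the continuation line and the `Returns:` section are left unchanged.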
