Strawberry Fields
I was working on porting a project over to Strawberry from Graphene-Django when I suddenly hit a snag. Once I found out what had happened to the Strawberry Fields, I was just glad I could solve it.
The premise
So the premise is that I worked on a project written with Graphene-Django, for Django obviously, which is essentially the Python framework implementation behind a GraphQL schema: you have a GraphQL endpoint where you send your queries and mutations, and it validates them against the schema and executes them. The two problems with the framework are that it is not fully async, and that it leans heavily on metaclass programming, which implies a lot of configuring but not a lot of control over the implementation details.
This meant when we were hitting our performance issues, we tried everything from low-hanging fruit like using Dataloaders, to optimizing certain queries by not relying on the ORM (Graphene Mongo in this case) and just using PyMongo directly.
Still we got stuck again, and the fact that we relied heavily on the Promise implementation in Python, to match Promises/A+ from JavaScript land, did not actually help. The library also tried to make its concurrency more streamlined, but alas. So the move was to another framework, preferably in Python, to keep the current dev team.
Enter Strawberry
So Strawberry is a nice framework that supports both sync and async. It involves a lot of configuring as well, but it also lets you control the implementation details, for instance through things called `FieldExtension`s. These can run either when the schema is first generated (through the `apply` method) or when the nodes are being resolved (through either the sync `resolve` or the async `resolve_async` method). This is a wonderful way, through a middleware-type approach, to tweak the implementation details.
One thing that was kind of lacking in the Graphene-Django setup was nice automatic control over which fields were allowed as filters. Ideally it should always just be the fields exposed on the Node itself. That however was not the case; you had to wire it up manually. Trying to make the new stack better and fix that particular nuisance, I made a simple `BaseExtension` class:
```python
# Import paths may differ slightly across Strawberry versions.
from typing import Optional, Type, cast

import strawberry
from strawberry.annotation import StrawberryAnnotation
from strawberry.arguments import StrawberryArgument
from strawberry.extensions import FieldExtension
from strawberry.field import StrawberryField
from strawberry.type import StrawberryOptional
from strawberry.types.types import WithStrawberryObjectDefinition


class BaseExtension(FieldExtension):
    def apply(self, field: StrawberryField) -> None:
        self.filter_fields = []
        resolved_type: Type[WithStrawberryObjectDefinition] = cast(
            Type[WithStrawberryObjectDefinition], field.resolve_type()
        )
        if resolved_type.__strawberry_definition__.specialized_type_var_map:
            node = cast(
                Type[WithStrawberryObjectDefinition],
                resolved_type.__strawberry_definition__.specialized_type_var_map[
                    "NodeType"
                ],
            )
            # Expose every field on the node as an optional filter argument.
            for f in node.__strawberry_definition__.fields:
                field.arguments.append(
                    StrawberryArgument(
                        python_name=f.name,
                        graphql_name=f.name.replace("_", "")
                        if f.name.startswith("_")
                        else None,
                        type_annotation=StrawberryAnnotation(
                            Optional[f.type]
                            if not isinstance(f.type, StrawberryOptional)
                            else Optional[f.type.of_type]
                        ),
                        description="",
                        default=strawberry.UNSET,
                    )
                )
                self.filter_fields.append(f.name)
```
All this really does is go over the fields defined on the node and add each one as a `StrawberryArgument`, so that you can filter on them and they will be passed along as `kwargs` in the `resolve` functions.
Perfect.
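On the resolving side, those generated arguments then have to be separated from the regular resolver kwargs again. A rough standalone sketch of that bookkeeping (the `split_filters` helper and the `UNSET` stand-in sentinel are mine, for illustration only, not the project's code):

```python
UNSET = object()  # stand-in for strawberry.UNSET


def split_filters(filter_fields, kwargs):
    """Separate the auto-generated filter arguments from the rest.

    Arguments the client never supplied stay UNSET and are dropped,
    so they don't turn into accidental `field = None` filters.
    """
    filters = {
        k: v for k, v in kwargs.items() if k in filter_fields and v is not UNSET
    }
    rest = {k: v for k, v in kwargs.items() if k not in filter_fields}
    return filters, rest


filters, rest = split_filters(
    ["project_id", "name"],
    {"project_id": "p1", "name": UNSET, "first": 10},
)
print(filters)  # {'project_id': 'p1'}
print(rest)     # {'first': 10}
```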
Snag time
So I was porting the `NodeType`s, and this project also uses Relay, a particular specification on top of GraphQL. Not very important, except to say that at this point I had not made any connections yet, as in one Node -> another Node, which is quite common in GraphQL and in Relay.
When I made the first connection, the schema would not even generate. I was so frustrated, and nothing worked. I could go from a resolver function -> `relay.ListConnection[NodeType]`, but not from Node -> `relay.ListConnection[NodeType]`. It kept complaining about it not being a GraphQL input type. I did not want it as an input. I struggled and looked deep into the source code of everything, trying to hack it there, making it dynamically an input or an output depending on properties, and then I suddenly stopped. Since there was no mention of this online whatsoever, it had to be a problem I had caused and created myself.
I went to bed, late. Woke up. Paced around a bit. In my head I thought, why is it automatically turning into an argu.....oh I am an idiot.
So I revisited my `BaseExtension` class that powered the dynamic argument adding. I tweaked it here and there, and the following is the fixed version:
```python
# Import paths may differ slightly across Strawberry versions.
import inspect
from typing import Optional, Type, cast

import strawberry
from strawberry.annotation import StrawberryAnnotation
from strawberry.arguments import StrawberryArgument
from strawberry.extensions import FieldExtension
from strawberry.field import StrawberryField
from strawberry.type import StrawberryOptional
from strawberry.types.types import WithStrawberryObjectDefinition


class BaseExtension(FieldExtension):
    def apply(self, field: StrawberryField) -> None:
        self.filter_fields = ["project_id", "change_order_id"]
        resolved_type: Type[WithStrawberryObjectDefinition] = cast(
            Type[WithStrawberryObjectDefinition], field.resolve_type()
        )
        if resolved_type.__strawberry_definition__.specialized_type_var_map:
            node = cast(
                Type[WithStrawberryObjectDefinition],
                resolved_type.__strawberry_definition__.specialized_type_var_map[
                    "NodeType"
                ],
            )
        else:
            node = resolved_type
        for f in node.__strawberry_definition__.fields:
            # Skip relay connections: turning them into arguments would
            # force them to become GraphQL input types and break the schema.
            if inspect.isclass(f.type) and issubclass(
                f.type, strawberry.relay.types.ListConnection
            ):
                continue
            if isinstance(f.type, StrawberryOptional):
                if inspect.isclass(f.type.of_type) and issubclass(
                    f.type.of_type, strawberry.relay.types.ListConnection
                ):
                    continue
            field.arguments.append(
                StrawberryArgument(
                    python_name=f.name,
                    graphql_name=f.name.replace("_", "")
                    if f.name.startswith("_")
                    else None,
                    type_annotation=StrawberryAnnotation(
                        Optional[f.type]
                        if not isinstance(f.type, StrawberryOptional)
                        else Optional[f.type.of_type]
                    ),
                    description="",
                    default=strawberry.UNSET,
                )
            )
            self.filter_fields.append(f.name)
```
Essentially what I needed to do was check whether the `type` or `of_type` is a class, and if so, whether it is a `relay.ListConnection` subclass, and then exclude it from the argument generation. Everything worked right after this.
Conclusion
I really like this framework. It gives me insight into how things operate and why a particular query is sometimes slow, and it gives you the space to fix it. For example, with the Dataloaders we can now load all the necessary subparts of a node in one go. That was not possible before. However, it was still as slow as the old stack, because each node on its own tried to create this new `relay.ListConnection` for essentially one `Edge`.
We already have all the instances needed to make all the edges when doing the Dataloader logic, so in that particular spot you can also implement the creation of all the edges in one go. Then you have a simple mapping of `node.id -> Edge` and you are done. This sped things up by quite a significant margin.
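The shape of that batching is simple to show in isolation. A minimal sketch with made-up data (the `parent_id` rows and `batch_load_edges` function are hypothetical, standing in for the project's Dataloader batch function):

```python
from collections import defaultdict

# Hypothetical pre-fetched child rows; each carries its parent node's id.
rows = [
    {"id": "e1", "parent_id": "n1"},
    {"id": "e2", "parent_id": "n1"},
    {"id": "e3", "parent_id": "n2"},
]


def batch_load_edges(node_ids):
    """Build every edge in one pass, then serve each node from a
    simple node_id -> edges mapping instead of one query per node."""
    edges_by_node = defaultdict(list)
    for row in rows:
        edges_by_node[row["parent_id"]].append(row["id"])
    return [edges_by_node[node_id] for node_id in node_ids]


print(batch_load_edges(["n1", "n2"]))  # [['e1', 'e2'], ['e3']]
```

The real version would build `Edge` (and connection) instances instead of id lists, but the one-pass grouping is the whole trick.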
Something the old stack could not really do; it had no real way of giving you the tools to do the same thing.