• 5 Posts
  • 39 Comments
Joined 2 years ago
Cake day: June 12th, 2023


  • TLDR: Not a bug but a feature, i.e. it works as intended :)

    …but few-to-none of the comments/votes did. Everything since subscribing is entirely in sync.

    That is by design. If every instance automatically synchronized (federated) every post and every comment from every other instance… the whole fediverse would explode? :) Well, it would at least require a lot more resources to host any instance.

    As for “loading history”: if you take the true URL[1] of a post or comment and paste it into the search bar of your instance, your instance will load it (and it will be visible in the corresponding community). Votes are the one problem: afair Lemmy does not even offer a mechanism that lets other instances fetch all historical votes. Do not confuse this with votes that are already federated. The moment you subscribed is the moment the instance hosting that community started sending everything that happens in that community from then on to lemmy.ml (your instance).

    [1] - “true URL” here means the URL on the instance where the resource originates, i.e. the instance hosting that comment/post/community. You can find it behind the little fediverse button on each non-local resource (comment/post/community).

    E: I see others beat me to it haha













  • Ah, you are using a pretty different deployment then. Even the postgres image is different than in the usual deployment (pgautoupgrade/pgautoupgrade:16-alpine instead of postgres:16-alpine); this might or might not cause differences.

    I would try increasing POSTGRES_POOL_SIZE to 10-20, but I am guessing here. The idea is that lemmy is hammering postgres through the default 5 connections, which increases CPU usage, but that is a bit of a stretch.







  • So, the word here is concurrency (often loosely called parallelism). It’s not something specific to Python; asyncio is just Python’s implementation of asynchronous execution, which lets a single thread interleave several tasks.

    Imagine a pizza restaurant that has one cook. This is your typical non-async, non-threading python script - single-threaded.
    The cook checks for new orders, picks up the first one and starts making the pizza one instruction at a time - fetching the dough, waiting for the ham slicer to finish slicing, … eventually putting the unbaked pizza into the oven and sitting there waiting for the pizza to bake.
    The cook is rather inefficient here: instead of waiting for the ham slicer and the oven to finish their jobs, he could be picking up new orders, starting new pizzas and fetching/making other ingredients.
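
    To make the one-cook kitchen concrete, here is a minimal blocking sketch. The names (make_pizza_blocking, serve_blocking) and the durations are made up for illustration, and time.sleep just stands in for "waiting on a machine":

```python
import time

# Hypothetical durations in seconds, scaled down so the sketch runs quickly.
SLICE_TIME = 0.05
BAKE_TIME = 0.1


def make_pizza_blocking(order: str) -> str:
    """One cook, one pizza at a time: every wait blocks the whole kitchen."""
    time.sleep(SLICE_TIME)  # stand and watch the ham slicer
    time.sleep(BAKE_TIME)   # sit in front of the oven
    return f"{order} done"


def serve_blocking(orders: list[str]) -> tuple[list[str], float]:
    """Process the orders strictly one after another and time the whole run."""
    start = time.perf_counter()
    results = [make_pizza_blocking(order) for order in orders]
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = serve_blocking(["ham", "cheese", "salami"])
    # The total grows with the number of orders: roughly 3 * (SLICE_TIME + BAKE_TIME).
    print(results, f"{elapsed:.2f}s")
```

    Every extra order adds its full waiting time to the total, because nothing else happens while the cook waits.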

    This is where asynchronicity comes in as a solution. The cook is your single thread, and the machines are mechanisms that have to be started but don’t have to be waited on - these are usually various sockets and file buffers (notice these are things your OS can handle for you on the side, hence async IO).
    So, the cook configures the ham slicer (puts a block of ham in) and starts it - but does not wait for each ham slice to fall out to put it on the pizza. Instead he picks up a new order and goes through the motions until the ham slicer is done (or until he needs the slicer to cut a different ingredient; in that case he would have to wait for the ham task to finish first, put …cheese there and switch to finishing the first order with ham).

    With proper asynchronicity your cook can now handle a lot more pizza orders, simply because his time is not spent so much on waiting.
    Making a single pizza is not faster, but in total the cook can make more of them in the same time; this is the important bit.


    Coming back to why an async REPL is useful: it simply comes down to how Python implements async - with special (“colored”) functions:

    async def prepare_and_bake(pizza):
      await oven.is_empty()  # await - a context switch can occur and python will check if other asynchronous tasks can be continued/finalized
      # so instead of blocking here, waiting for the oven to be empty the cook looks for other tasks to be done
      await oven.bake(pizza)  
      ...
    

    The function prepare_and_bake() is an asynchronous function (async def), which makes it special. I would have to dive into Event Loops here to fully explain why an async REPL is useful, but in short: you can’t call async functions directly to execute them. Calling one only gives you a coroutine object; to actually run it you have to await it or schedule it on an event loop.
    An async REPL is here to help with that, allowing you to do await prepare_and_bake(pizza) directly, in the REPL.
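
    For reference, a runnable sketch of the same idea in a normal script. The Oven class, its slots parameter, serve() and the durations are my own assumptions for illustration, not anything from a real library; asyncio.sleep stands in for the machines doing their work:

```python
import asyncio
import time

# Hypothetical durations in seconds, scaled down so the sketch runs quickly.
SLICE_TIME = 0.05
BAKE_TIME = 0.1


class Oven:
    """Hypothetical oven with a limited number of slots."""

    def __init__(self, slots: int = 3) -> None:
        self._slots = asyncio.Semaphore(slots)

    async def bake(self, pizza: str) -> str:
        async with self._slots:             # wait (without blocking) for a free slot
            await asyncio.sleep(BAKE_TIME)  # baking needs no attention from the cook
        return f"{pizza} baked"


async def prepare_and_bake(oven: Oven, pizza: str) -> str:
    await asyncio.sleep(SLICE_TIME)  # the slicer runs on its own, the cook moves on
    return await oven.bake(pizza)


async def serve(orders: list[str]) -> tuple[list[str], float]:
    oven = Oven()  # created inside the running event loop
    start = time.perf_counter()
    # gather() schedules all orders at once and returns results in input order.
    results = await asyncio.gather(*(prepare_and_bake(oven, o) for o in orders))
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    # In a script the coroutine has to be handed to an event loop explicitly.
    results, elapsed = asyncio.run(serve(["ham", "cheese", "salami"]))
    print(results, f"{elapsed:.2f}s")  # far less than 3 * (SLICE_TIME + BAKE_TIME)
```

    In a plain REPL you would need the asyncio.run(...) wrapper every time. In the asyncio REPL (python -m asyncio, Python 3.8+) a loop is already running, so a bare await serve([...]) works at the prompt.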


    And to give you an example where async does not help, you can’t speed up cutting up onions with a knife, or grating cheese.
    Now, if every ordered pizza required a lot of cheese you might want to employ a secondary cook to preemptively do these tasks (and “buffer” the processed ingredients in a bowl so that your primary cook does not have to always wait for the other cook to start and finish).

    This is called parallelism: multiple tasks that require direct work and can’t be relegated to a machine (the OS; to be precise, tasks that can’t just be started and awaited) are done at the same time by separate workers.
    In a real example, if something requires a lot of computation (calculating something - like getting the nth Fibonacci number, applying a function to a list with a lot of entries, …) you would want to employ secondary threads or processes so that your main thread does not get blocked.
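
    A rough sketch of the "second cook" idea, assuming a naive fib() as the heavy computation. heartbeat() and main() are made-up names; the only real APIs used are asyncio and concurrent.futures from the standard library:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def fib(n: int) -> int:
    """Deliberately naive, CPU-bound Fibonacci (fib(0) == 0, fib(1) == 1)."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


async def heartbeat(ticks: list[int]) -> None:
    """Stands in for the main cook: keeps working while fib() runs elsewhere."""
    while True:
        ticks.append(1)
        await asyncio.sleep(0.01)


async def main(n: int = 22) -> tuple[int, int]:
    loop = asyncio.get_running_loop()
    ticks: list[int] = []
    beat = asyncio.create_task(heartbeat(ticks))
    # The worker thread is the secondary cook. In CPython, swapping in a
    # ProcessPoolExecutor is the usual choice when the work should really
    # run in parallel, because threads share the GIL.
    with ThreadPoolExecutor(max_workers=1) as pool:
        result = await loop.run_in_executor(pool, fib, n)
    beat.cancel()
    await asyncio.gather(beat, return_exceptions=True)  # let the cancellation finish
    return result, len(ticks)


if __name__ == "__main__":
    result, tick_count = asyncio.run(main())
    print(result, tick_count)
```

    Calling fib(n) directly inside main() would instead freeze the event loop (and the heartbeat) until the computation finished.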

    To summarize, async (concurrency) helps in cases where you can delegate (IO) processing to the OS (usually reading from or writing to a buffer). It does not make anything go faster in itself, just more efficient, as you don’t have to wait so much, which is often a problem in single-threaded applications.

    Hopefully this was a somewhat understandable explanation haha. Here is some recommended reading: https://realpython.com/async-io-python/

    Final EDIT: Having read it myself a few times, a pizza bakery example is not optimal. A better example would have been something where one has to talk with other people who don’t respond immediately - to better drive home that this is mainly used for Input/Output tasks.