This is an idea I’ve been toying with for a bit. A ton of media includes unimportant information that doesn’t need to be stored pixel-perfect. Storing large portions of the image data as text will save substantial amounts of storage, and as on-device image generation becomes commonplace, digital memories will become the main way people capture the world around them. I think this will inevitably be the next form of media capture (photography and video). It won’t replace other methods or formats, but I could see phone cameras defaulting to saving images as digital memories to save on storage.

  • Bloody Harry@feddit.de
    arrow-up
    66
    ·
    1 year ago

    currently, storage space is significantly cheaper than all the CPU power needed to generate the images from a text description. also, what if you actually wanted to view the background of the object? and where’s the advantage besides an at best 40% increase in storage efficiency? after all, people are taking pictures to actually capture the moment. otherwise they would do voice memos all the time.

    • duncesplayed@lemmy.one
      English
      arrow-up
      7
      arrow-down
      2
      ·
      1 year ago

      after all, people are taking pictures to actually capture the moment

      Depending on what you mean by “the moment”, I don’t think that’s really true. Modern cell phone photography doesn’t really give you what the sensors have picked up. You take a picture of your friend with his eyes closed and the phone will change the picture to have his eyes open. You take a blurry picture of the moon and your phone will enhance it to make a better picture of the moon. I mean, some people hate it, but a lot of people do actually like it.

      And they like it because they don’t really take pictures for the purpose of posterity. They don’t take a picture of their friend because they need to look back 20 years from now and remember exactly how that one plastic bag 30m in the distance was crumpled. They take the picture because they want to post to Instagram, get some likes from their friends, and maybe look back 20 years from now to remember the general vibe, and if their phone can “enhance” that for them, all the better.

      If people could record a voice memo and have their phone actually make a really decent Instagram post out of it for them, I 1000% believe people would do it instead of taking an actual picture. Posting pictures is more about socializing than it is about posterity.

    • Crow@lemmy.worldOP
      English
      arrow-up
      3
      arrow-down
      10
      ·
      1 year ago

      Which is why I wanted to include video in my concept because video file sizes are getting out of control.

      • SzethFriendOfNimi@lemmy.world
        arrow-up
        12
        ·
        1 year ago

        As a way to store information, it’s really overly complicated and comes with all the downsides of human memory.

        As a way to explain how imperfect human memory is, or as a way to add deliberate “memory” decay to an artificial intelligence, however, it could be useful.

      • stoy@lemmy.zip
        arrow-up
        7
        ·
        1 year ago

        That makes even less sense: the CPU/GPU usage would be insane and, if used at large scale, would quickly get up there with crypto mining in terms of energy use, which is already a big problem for the environment.

        Storing large files on a hard drive, on the other hand, needs relatively little energy.

      • Footnote2669@lemmy.zip
        arrow-up
        6
        ·
        1 year ago

        Imo video sizes will eventually plateau; there is only so much resolution people actually need. There is so little difference from a distance. From 5m I can’t really tell much difference between 4K and 1080p content on a 50-inch TV, not to mention resolutions above that.
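
A rough sanity check of that viewing-distance claim, as a sketch only. It assumes a 16:9 panel and the common rule of thumb that 20/20 vision resolves about one arcminute; neither figure comes from the comment, and `pixel_arcmin` is a hypothetical helper written for this illustration.

```python
import math

ARCMIN_PER_RADIAN = 60 * 180 / math.pi
ACUITY_ARCMIN = 1.0  # assumption: roughly what 20/20 vision can resolve


def pixel_arcmin(diagonal_in, horizontal_px, distance_m, aspect=(16, 9)):
    """Angle (in arcminutes) that one pixel covers for a viewer at distance_m."""
    w, h = aspect
    # Screen width from the diagonal, converted from inches to metres.
    width_m = diagonal_in * 0.0254 * w / math.hypot(w, h)
    pitch_m = width_m / horizontal_px  # width of a single pixel
    return math.atan(pitch_m / distance_m) * ARCMIN_PER_RADIAN


for name, px in (("1080p", 1920), ("4K", 3840)):
    angle = pixel_arcmin(50, px, 5.0)
    verdict = "below" if angle < ACUITY_ARCMIN else "above"
    print(f"{name}: {angle:.2f} arcmin per pixel ({verdict} the assumed acuity limit)")
```

Under these assumptions a 1080p pixel on a 50-inch screen at 5m already covers well under one arcminute, so the extra pixels of 4K would fall below what the eye can separate at that distance.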

      • steakmeout@lemmy.world
        English
        arrow-up
        3
        ·
        1 year ago

        Are they? Video compresses really well these days. How is replacing real footage with generated content that cannot be accurate better than accurate video?