libpacket2 with VU support #149

Merged: 40 commits, Nov 24, 2020

@h4570
Contributor

h4570 commented Nov 7, 2020

Hi guys.

After the many hours I've spent on the PS2, I'm ready to present VU1 rendering support for PS2SDK. Rendering on VU1 is the most efficient way to do 3D on the PS2, and it is what most games used.

Video:
https://www.youtube.com/watch?v=oupamAeUvHs

It was tested on PCSX2 and hardware (ps2link).

[image: before-after]

I'm not an experienced C/C++ developer, so I will be very grateful for any suggestions and required changes.

Features:

  • VU1 micro program uploading
  • VU1 upload data builder
  • VU1 rendering micro program which I've developed for my wannabe PS2 engine project
    • Vertex transformation
    • Clipping
    • Scaling
    • Perspective divide etc.
  • VU1 double buffering (TOP register)
  • A sample that draws 64 boxes on screen at the same time.

Thanks!

@terremoth

Very nice!

@HowlingWolfHWC

Very good job!

@pedroduarte0

pedroduarte0 commented Nov 8, 2020

Great, well done!

@rickgaiser
Member

Congratulations on your achievement. The VUs make the PS2 one of the most interesting platforms of its time, but also one of the most difficult to program.

However, in its current state the code you're proposing is not suitable for inclusion in ps2sdk. It's a good 'proof-of-concept' or 'hello world'. I'll try to explain why I think it isn't ready:

The heart of the PS2 is the DMA controller. Creating DMA packets is essential, and I think it's the area where the current ps2sdk is weakest. The "libpacket" library in ee/packet should be a very important library. Part of this PR should probably try to improve libpacket, or create a new super-libpacket, so that it's easier to manage multiple packets, link them together and send them (using libdma) to the vu1/gs/etc...
Since packet support in ps2sdk is so bad, you've added your own vu1-specific packet library to libdraw. Your library is going in the right direction, but it's not "object oriented". For instance:

vu1_create_dyn_list();
vu1_dyn_add_64(1);
vu1_dyn_add_64(2);
vu1_dyn_add_64(3);

vu1_send();

This allows for only 1 "dyn_list" at a time. A more "C" object-oriented approach would be to have the user manage multiple packets, like so:

packet1 = packet_create(...);
packet_add_64(packet1, 1);
packet_add_64(packet1, 2);
packet_add_64(packet1, 3);

dma_send(vu1, packet1);

These could all be in libpacket and libdma, I think? There are a lot more packet libraries for the PS2, though. gsKit, for example, manages its own double-buffered packets, just like what you've created. But a far more powerful implementation (in C++, by Sony!) is this one:
https://gitlab.com/ps2max/ps2stuff/-/blob/master/include/ps2s/packet.h

Something like this in "C" would be a perfect "libpacket" for ps2sdk I think.

I know this is asking a lot, but I'd rather have the current crappy packet library in ps2sdk, than lots of packet libraries all serving their own purpose.
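The handle-based style described above can be sketched on a host machine as follows. This is only an illustration of the idea, not the ps2sdk API: `pkt_t`, `pkt_create`, `pkt_add_u64`, `pkt_count_u64` and `pkt_free` are hypothetical names, and plain `malloc` stands in for the aligned allocation a real DMA buffer would need.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical packet handle: every packet owns its buffer and write cursor. */
typedef struct {
    uint64_t *base;     /* start of the buffer */
    uint64_t *next;     /* next free 64-bit slot */
    size_t    capacity; /* capacity in 64-bit words */
} pkt_t;

/* Allocates a packet with room for capacity_u64 64-bit words. */
pkt_t *pkt_create(size_t capacity_u64)
{
    pkt_t *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->base = malloc(capacity_u64 * sizeof *p->base);
    if (p->base == NULL) {
        free(p);
        return NULL;
    }
    p->next = p->base;
    p->capacity = capacity_u64;
    return p;
}

/* Appends one 64-bit word to the given packet only. */
void pkt_add_u64(pkt_t *p, uint64_t value)
{
    assert((size_t)(p->next - p->base) < p->capacity);
    *p->next++ = value;
}

/* Number of 64-bit words written so far. */
size_t pkt_count_u64(const pkt_t *p)
{
    return (size_t)(p->next - p->base);
}

/* Releases the packet and its buffer. */
void pkt_free(pkt_t *p)
{
    if (p != NULL) {
        free(p->base);
        free(p);
    }
}
```

Because all state lives in the handle, two packets can be filled independently, which the single global "dyn_list" cannot do.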

2 years ago I tried porting libstuff and libgl to ps2sdk. They were both created by Sony and are very efficient. They allow the application to use the OpenGL 1.2 interface, and the library takes care of the rest. It has multiple VU1 programs, each one highly optimized for a specific purpose. Sony originally developed this library for their own SDK and for ps2linux. Because of ps2linux it was released as open source. It's triple-A-game-grade code, so take a good look ;).
I've managed to get some samples running, but there were still some bugs to be sorted out, partly because C++ support was broken at the time. My latest attempts can be found here:
https://gitlab.com/ps2max/ps2stuff
https://gitlab.com/ps2max/ps2gl

Would it be an idea to try to fix these 2 libraries and add them as separate git repositories to ps2dev? The OpenGL 1.2 interface would be a lot easier to use and richer in features for Tyra and others, and you would probably get awesome performance ;)

@h4570
Contributor Author

h4570 commented Nov 11, 2020

Hi Rick.

Thank you for your response. Your answer helped me understand the current situation better.

I will try to create a second version of libpacket based on the code you sent me. Be warned that this may take a while - I need to study the EE / DMA documentation thoroughly.
As for "object-oriented" - I don't know how I didn't think of it :D

@h4570 changed the title from "Hello VU1!" to "VU1 and libpacket2" on Nov 19, 2020
@h4570
Contributor Author

h4570 commented Nov 19, 2020

And on the third day Rick said "remove this vu1.c.., just create better libpacket!", and .... here it is "libpacket2"

Hi guys!

  • Moved vu1.c from libdraw to the vu1 sample
  • Created libpacket2 with chain & vif support
  • Added a wrapper function to libdma for libpacket2
  • Refactored vu1.c and the whole sample to use the new libpacket2

I will be grateful for any tips!

Everything was tested on an emulator and on hardware.
Thanks.

Member

@rickgaiser rickgaiser left a comment

Great work in such a short time!

With some minor changes I'd love to merge this into ps2sdk.

Review threads (resolved):

  • ee/dma/src/dma.c (outdated)
  • ee/draw/include/draw3d.h
  • ee/draw/samples/vu1/Makefile.sample
  • ee/draw/samples/vu1/main.c (outdated)
  • ee/draw/samples/vu1/vu1.c (outdated, 4 threads)
  • ee/packet2/include/packet2.h (outdated)
  • ee/packet2/src/erl-support.c

@h4570
Contributor Author

h4570 commented Nov 20, 2020

OK, that's all :)

@rickgaiser
Member

Amazing!

I've tried to summarize the libraries and APIs you're adding:

/* libpacket2: packet2 API */
packet2_t *packet2_create_normal(u16 qwords, enum Packet2Type type);
packet2_t *packet2_create_chain(u16 qwords, enum Packet2Type type, u8 tte);
packet2_t *packet2_create_from(qword_t *base, qword_t *next, u16 qwords, enum Packet2Type type, enum Packet2Mode mode);
void packet2_free(packet2_t *packet2);
void packet2_reset(packet2_t *packet2, u8 clear_mem);
void packet2_add(packet2_t *a, packet2_t *b);

Instead of packet2_create_normal and packet2_create_chain, I would prefer a single packet2_create with 1 extra parameter: enum Packet2Mode mode. Just like packet2_create_from, which is also a single function.


/* libpacket2: packet2 API - inline */
static inline void packet2_update(packet2_t *packet2, qword_t *qw) { packet2->next = qw; }
static inline void packet2_add_u128(packet2_t *packet2, const u128 val) { *((u128 *)packet2->next)++ = val; }
static inline void packet2_add_s128(packet2_t *packet2, const s128 val) { *((s128 *)packet2->next)++ = val; }
static inline void packet2_add_u64(packet2_t *packet2, const u64 val) { *((u64 *)packet2->next)++ = val; }
static inline void packet2_add_s64(packet2_t *packet2, const s64 val) { *((s64 *)packet2->next)++ = val; }
static inline void packet2_add_u32(packet2_t *packet2, const u32 val) { *((u32 *)packet2->next)++ = val; }
static inline void packet2_add_s32(packet2_t *packet2, const s32 val) { *((s32 *)packet2->next)++ = val; }
static inline void packet2_add_float(packet2_t *packet2, const float val) { *((float *)packet2->next)++ = val; }
static inline u32 packet2_get_qw_count(packet2_t *packet2) { return ((u32)packet2->next - (u32)packet2->base) >> 4; }
static inline u8 packet2_doesnt_have_even_number_of_quads(packet2_t *packet2) { return ((u32)packet2->next & 0xF) != 0; }
static inline u8 packet2_is_dma_tag_opened(packet2_t *packet2) { return packet2->tag_opened_at != NULL; }
static inline u8 packet2_is_vif_code_opened(packet2_t *packet2) { return packet2->vif_code_opened_at != NULL; }
static inline void packet2_align_to_qword(packet2_t *packet2)

/* libpacket2: packet2 chain API - inline */
static inline void packet2_chain_set_dma_tag(dma_tag_t *tag, u32 qwc, u32 pce, u32 id, u8 irq, const u128 *addr, u8 spr)
static inline void packet2_chain_add_dma_tag(packet2_t *packet2, u32 qwc, u32 pce, enum DmaTagType id, u8 irq, const u128 *addr, u8 spr)
static inline void packet2_chain_close_tag(packet2_t *packet2)
static inline void packet2_chain_open_cnt(packet2_t *packet2, u8 irq, u32 pce, u8 spr)
static inline void packet2_chain_open_end(packet2_t *packet2, u8 irq, u32 pce)
static inline void packet2_chain_ref(packet2_t *packet2, const void *ref_data, u32 qw_length, u8 irq, u8 spr, u32 pce)
static inline void packet2_chain_next(packet2_t *packet2, const dma_tag_t *next_tag, u8 irq, u8 spr, u32 pce)
static inline void packet2_chain_refs(packet2_t *packet2, const void *ref_data, u32 qw_length, u8 irq, u8 spr, u32 pce)
static inline void packet2_chain_refe(packet2_t *packet2, const void *ref_data, u32 qw_length, u8 irq, u8 spr, u32 pce)
static inline void packet2_chain_call(packet2_t *packet2, const void *next_tag, u8 irq, u8 spr, u32 pce)
static inline void packet2_chain_ret(packet2_t *packet2, u8 irq, u32 pce)

Great!
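For readers unfamiliar with what a function like packet2_chain_set_dma_tag has to produce, here is a rough sketch of packing the lower 64 bits of a source-chain DMA tag. It is not the libpacket2 implementation: `dma_tag_pack` and the `TAG_*` names are hypothetical, and the field positions (QWC in bits 0-15, PCE in 26-27, ID in 28-30, IRQ in 31, ADDR in 32-62, SPR in 63) are my reading of the EE documentation and should be checked against it.

```c
#include <assert.h>
#include <stdint.h>

/* Source-chain tag IDs (assumed values, see the lead-in). */
enum dma_tag_id {
    TAG_REFE = 0, TAG_CNT = 1, TAG_NEXT = 2, TAG_REF = 3,
    TAG_REFS = 4, TAG_CALL = 5, TAG_RET = 6, TAG_END = 7
};

/* Packs the tag fields into the lower 64 bits of a DMA tag quadword. */
uint64_t dma_tag_pack(uint32_t qwc, uint32_t pce, enum dma_tag_id id,
                      uint32_t irq, uint32_t addr, uint32_t spr)
{
    assert(qwc <= 0xFFFF);      /* QWC is a 16-bit quadword count */
    assert((addr & 0xF) == 0);  /* DMA addresses are quadword aligned */
    return  (uint64_t)(qwc & 0xFFFF)
          | (uint64_t)(pce & 0x3) << 26
          | (uint64_t)((uint32_t)id & 0x7) << 28
          | (uint64_t)(irq & 0x1) << 31
          | (uint64_t)(addr & 0x7FFFFFFF) << 32
          | (uint64_t)(spr & 0x1) << 63;
}
```

With this layout a CNT tag followed by three quadwords of inline data would be `dma_tag_pack(3, 0, TAG_CNT, 0, 0, 0)`, and a REF tag carries the address of the referenced data in its ADDR field.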


/* libpacket2: packet2 VIF chain API - inline */
static inline void packet2_vif_open_unpack(packet2_t *packet2, enum UnpackMode mode, u32 vuAddr, u8 dblBuffered, u8 masked, u8 usigned, u8 irq)
static inline void packet2_vif_close_unpack(packet2_t *packet2, u32 unpack_num)
static inline void packet2_vif_open_direct(packet2_t *packet2, u8 irq)
static inline void packet2_vif_close_direct_manual(packet2_t *packet2, u32 qwords)
static inline void packet2_vif_close_direct_auto(packet2_t *packet2)
static inline void packet2_vif_nop(packet2_t *packet2, u8 irq)
static inline void packet2_vif_mpg(packet2_t *packet2, u32 num, u32 addr, u8 irq)
static inline void packet2_vif_stcycl(packet2_t *packet2, u32 wl, u32 cl, u8 irq)
static inline void packet2_vif_offset(packet2_t *packet2, u32 offset, u8 irq)
static inline void packet2_vif_base(packet2_t *packet2, u32 base, u8 irq)
static inline void packet2_vif_flush(packet2_t *packet2, u8 irq)
static inline void packet2_vif_mscal(packet2_t *packet2, u32 addr, u8 irq)
static inline void packet2_vif_mscnt(packet2_t *packet2, u8 irq)
static inline void packet2_vif_itop(packet2_t *packet2, u32 itops, u8 irq)
static inline void packet2_vif_stmod(packet2_t *packet2, u32 mode, u8 irq)
static inline void packet2_vif_mskpath3(packet2_t *packet2, u32 mask, u8 irq)
static inline void packet2_vif_mark(packet2_t *packet2, u32 value, u8 irq)
static inline void packet2_vif_flushe(packet2_t *packet2, u8 irq)
static inline void packet2_vif_flusha(packet2_t *packet2, u8 irq)
static inline void packet2_vif_mscalf(packet2_t *packet2, u32 addr, u8 irq)
static inline void packet2_vif_stmask(packet2_t *packet2, Mask mask, u8 irq)
static inline void packet2_vif_strow(packet2_t *packet2, const void *row_arr, u8 irq)
static inline void packet2_vif_stcol(packet2_t *packet2, const void *col_arr, u8 irq)

Great!
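Each of the VIF helpers above ultimately writes one 32-bit VIFcode into the packet. The sketch below shows how such a word could be assembled, assuming the usual layout (IMMEDIATE in bits 0-15, NUM in 16-23, CMD in 24-30, interrupt bit in 31). The names `vif_code_pack`, `vif_stcycl` and the `VIF_*` constants are hypothetical and are not taken from libpacket2.

```c
#include <assert.h>
#include <stdint.h>

/* A few VIF command codes (assumed values, see the lead-in). */
enum vif_cmd {
    VIF_NOP    = 0x00,
    VIF_STCYCL = 0x01,
    VIF_FLUSH  = 0x11,
    VIF_MSCAL  = 0x14,
    VIF_MPG    = 0x4A
};

/* Packs one 32-bit VIFcode from its four fields. */
uint32_t vif_code_pack(enum vif_cmd cmd, uint32_t num,
                       uint32_t immediate, uint32_t irq)
{
    assert(num <= 0xFF);
    assert(immediate <= 0xFFFF);
    return (immediate & 0xFFFF)
         | (num & 0xFF) << 16
         | ((uint32_t)cmd & 0x7F) << 24
         | (irq & 0x1) << 31;
}

/* STCYCL carries WL in the upper byte and CL in the lower byte of IMMEDIATE. */
uint32_t vif_stcycl(uint32_t wl, uint32_t cl, uint32_t irq)
{
    return vif_code_pack(VIF_STCYCL, 0, (wl & 0xFF) << 8 | (cl & 0xFF), irq);
}
```

A FLUSH followed by MSCAL at address 0, as used to start a VU1 program, would then be the two words `vif_code_pack(VIF_FLUSH, 0, 0, 0)` and `vif_code_pack(VIF_MSCAL, 0, 0, 0)`.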


/* libvu: packet2 VU API - inline */
static inline void vu_add_double_buffer_settings(packet2_t *packet2, u16 base, u16 offset)
static inline void vu_add_unpack_data(packet2_t *packet2, u32 t_dest_address, void *t_data, u32 t_size, u8 t_use_top)
static inline void vu_add_continue_program(packet2_t *packet2)
static inline void vu_add_start_program(packet2_t *packet2, u32 addr)
static inline void vu_add_end_tag(packet2_t *packet2)
static inline void vu_add_flush(packet2_t *packet2)
static inline void vu_open_unpack(packet2_t *packet2)
static inline void vu_close_unpack(packet2_t *packet2)
static inline void vu_unpack_add_u128(packet2_t *packet2, u128 v)
static inline void vu_unpack_add_s128(packet2_t *packet2, u128 v)
static inline void vu_unpack_add_u64(packet2_t *packet2, u64 v)
static inline void vu_unpack_add_2x_s64(packet2_t *packet2, s64 v1, s64 v2)
static inline void vu_unpack_add_s64(packet2_t *packet2, u64 v)
static inline void vu_unpack_add_u32(packet2_t *packet2, u32 v)
static inline void vu_unpack_add_s32(packet2_t *packet2, u32 v)
static inline void vu_unpack_add_float(packet2_t *packet2, float v)
static inline void vu_unpack_add_set(packet2_t *packet2, u32 loops_count)
static inline void vu_unpack_add_lod(packet2_t *packet2, lod_t *lod)
static inline void vu_unpack_add_texbuff_clut(packet2_t *packet2, texbuffer_t *texbuff, clutbuffer_t *clut)
static inline void vu_unpack_add_draw_finish_giftag(packet2_t *packet2)
static inline void vu_unpack_add_prim_giftag(packet2_t *packet2, prim_t *prim, u32 loops_count, u32 nreg, u8 nreg_count, u8 context)

I guess this is my mistake, sorry. All these functions have packet2_t *packet2 as the first parameter. In C++ terms this is not a new class vu, but actually a subclass of packet2_vif, right?
Change all function names from "vu_*" to "packet2_vu_*" and add them to libpacket. Then it's all in 1 libpacket2 library: all the functions needed to add data to a packet2_t.


void vu_upload_program(u32 t_dest, u32 *t_start, u32 *t_end, int dma_channel);

Change this into something like:

void packet2_vu_upload_program(packet2_t *packet2, u32 t_dest, u32 *t_start, u32 *t_end, int dma_channel);

Again, this only adds data to a packet2_t. The user then creates and transfers the packet. This also removes the dependency on libdma.

So as a user of the library I can create a chained packet that first transfers the code, and then transfers the data. All in a single chained packet with a single dma transfer.

@h4570 changed the title from "libvu and libpacket2" to "libpacket2" on Nov 21, 2020
@h4570 changed the title from "libpacket2" to "libpacket2 with VU support" on Nov 21, 2020
@h4570
Contributor Author

h4570 commented Nov 21, 2020

Done :)

void packet2_vu_add_micro_program(packet2_t *packet2, u32 dest, u32 *start, u32 *end);
u32 packet2_vu_count_program_instructions(u32 *start, u32 *end);
static inline u32 packet2_vu_get_packet_size_for_program(u32 *start, u32 *end);

@rickgaiser
Member

Great!

u32 packet2_vu_count_program_instructions(u32 *start, u32 *end);
static inline u32 packet2_vu_get_packet_size_for_program(u32 *start, u32 *end);

These 2 functions don't belong to the "packet2_*" family of functions, because they don't do anything with the packet2_t class/struct/type. They are helper functions. I also don't know what else to call them, so just leave them as they are, I guess.

If there are no comments/reviews/objections from others, I'll squash-merge this PR tomorrow.
Can't wait to play with this new library ;).
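As background for the two helper functions discussed above: a VU micro instruction is 64 bits wide (an upper and a lower 32-bit word), and a single MPG VIFcode can carry at most 256 instructions because its NUM field is 8 bits. A sketch of the arithmetic such helpers need could look like this. It is not the code from this PR, and `vu_count_instructions` and `vu_count_mpg_chunks` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Maximum number of 64-bit instructions one MPG VIFcode can upload. */
#define MPG_MAX_INSTRUCTIONS 256u

/* Counts 64-bit instructions between two pointers to 32-bit words. */
uint32_t vu_count_instructions(const uint32_t *start, const uint32_t *end)
{
    assert(end >= start);
    ptrdiff_t words = end - start;
    assert(words % 2 == 0); /* each instruction is an upper/lower word pair */
    return (uint32_t)(words / 2);
}

/* Number of MPG VIFcodes needed to upload the given instruction count. */
uint32_t vu_count_mpg_chunks(uint32_t instructions)
{
    return (instructions + MPG_MAX_INSTRUCTIONS - 1) / MPG_MAX_INSTRUCTIONS;
}
```

A program of 600 instructions would therefore need three MPG chunks (256 + 256 + 88), and the packet has to be sized for the instruction data plus the VIFcodes and DMA tags around it.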

@rickgaiser
Member

I've been experimenting with libpacket2 to see what the library would be like from a user's point of view. Some things are missing or not as easy to use as I was hoping. I've created this "pseudo code" as an example of how I think the inner loop of the example should be able to work:

for (j = 0; j < 8; j++)
{
    pkt_main = vif_packets[context];

    // Reset the main chaining packet
    packet2_reset(pkt_main, 0);

    // Calculate local screen position
    c_zbyszek_position[1] = j * 40.0F;
    create_local_world(local_world, c_zbyszek_position, object_rotation);
    create_world_view(world_view, camera_position, camera_rotation);
    create_local_screen(local_screen, local_world, world_view, view_screen);

    // Add local screen position into packet (DMA CNT mode)
    packet2_chain_open_cnt(pkt_main, 0, 0, 0);
    {
        packet2_vif_stcycl(pkt_main, 0, 0x0101, 0);
        packet2_vif_open_unpack(pkt_main, P2_UNPACK_V4_32, ...);
        {
            packet2_add_data(pkt_main, &local_screen, 8);
        }
        packet2_vif_close_unpack(pkt_main, t_size);
    }
    packet2_chain_close_tag(pkt_main);

    // Add cube to packet (DMA REF mode)
    // The cube packet is generated outside of the loop, it's only referenced here
    packet2_chain_ref(pkt_main, pkt_cube_object);

    // Start VU1 program and close DMA chain (DMA END mode)
    packet2_chain_open_end(pkt_main, 0, 0, 0);
    {
        packet2_vif_flush(pkt_main, 0);
        packet2_vif_mscal(pkt_main, 0, 0);
    }
    packet2_chain_close_tag(pkt_main);

    // Wait for previous transfer to finish
    dma_channel_wait(DMA_CHANNEL_VIF1, 0);
    // Start transfer of cube to VU1
    dma_channel_send_packet2(pkt_main, DMA_CHANNEL_VIF1, 1);

    // Switch packet, so we can proceed during DMA transfer
    context = !context;
}

NOTE1: I have not used the "vu" part of the library, as I feel constrained in possibilities and efficiency when using it.
NOTE2: I've made the program (this loop) in control of ALL packets and ALL DMA transfers, reduced to 1 transfer.
NOTE3: pkt_cube_object can be created outside of the loop, making the program run faster
NOTE4: It should be possible to chain packet_t to other packet_t, like in the pseudo code.
NOTE5: I'm waiting for the DMA transfer to finish BEFORE sending the next packet instead of AFTER it. This should give a little performance boost.
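The ordering in NOTE5 can be checked with a small host-side simulation. Everything here is a stand-in: `fake_dma_wait`, `fake_dma_send` and `build_packet` are hypothetical stubs that only track which of the two buffers the "DMA" is reading, so the asserts document the invariant rather than real hardware behaviour.

```c
#include <assert.h>

/* Index of the buffer the simulated DMA is reading, or -1 when idle. */
static int in_flight = -1;

/* Stand-in for dma_channel_wait(): the simulated transfer completes. */
static void fake_dma_wait(void)
{
    in_flight = -1;
}

/* Stand-in for dma_channel_send_packet2(): the channel must be idle. */
static void fake_dma_send(int buffer)
{
    assert(in_flight == -1);
    in_flight = buffer;
}

/* Stand-in for filling a packet: it must not touch the buffer in flight. */
static void build_packet(int buffer)
{
    assert(buffer != in_flight);
}

/* Runs the wait-before-send loop and returns the number of transfers started. */
int run_wait_before_send(int frames)
{
    int context = 0;
    int sends = 0;
    for (int j = 0; j < frames; j++) {
        build_packet(context); /* CPU work overlaps the previous transfer */
        fake_dma_wait();       /* wait only when the channel is needed again */
        fake_dma_send(context);
        sends++;
        context = !context;    /* switch to the other buffer */
    }
    fake_dma_wait();           /* drain the last transfer */
    return sends;
}
```

Waiting after the send instead would leave the CPU idle for the whole transfer. Waiting just before the next send lets packet building for one buffer proceed while the other buffer is still being transferred.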

What do you think? The library is already useful as it is, but I think it can be made simpler and more powerful. Should I pull it, and you/I/we change things later with PRs when needed?

One other thing: the packet2_vu_unpack_add_* functions are confusing and should not be needed. They are the same as the packet2_add_* functions, except for increasing a byte counter. They don't do anything with vu_unpack as the name suggests. packet2_add_* functions can also be changed to increase 1 or more byte/qwc counters.
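One way to avoid separate counters altogether is to derive them from the write cursor, as packet2_get_qw_count already does with a pointer difference. The sketch below illustrates that idea on a host machine. `cursor_packet_t` and its functions are hypothetical and not part of libpacket2, and alignment is measured relative to the buffer start, which assumes the buffer itself is quadword aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet with a byte cursor; all counts derive from it. */
typedef struct {
    uint8_t *base;     /* start of caller-provided storage */
    uint8_t *next;     /* write cursor */
    size_t   capacity; /* storage size in bytes */
} cursor_packet_t;

void cursor_packet_init(cursor_packet_t *p, void *storage, size_t capacity)
{
    p->base = storage;
    p->next = storage;
    p->capacity = capacity;
}

/* Bytes written so far. */
size_t cursor_packet_bytes(const cursor_packet_t *p)
{
    return (size_t)(p->next - p->base);
}

/* Generic append: copies the bytes and advances the cursor. */
void cursor_packet_add(cursor_packet_t *p, const void *data, size_t size)
{
    assert(cursor_packet_bytes(p) + size <= p->capacity);
    memcpy(p->next, data, size);
    p->next += size;
}

void cursor_packet_add_u32(cursor_packet_t *p, uint32_t v)
{
    cursor_packet_add(p, &v, sizeof v);
}

void cursor_packet_add_float(cursor_packet_t *p, float v)
{
    cursor_packet_add(p, &v, sizeof v);
}

/* Complete quadwords (16 bytes each) written so far. */
size_t cursor_packet_qw_count(const cursor_packet_t *p)
{
    return cursor_packet_bytes(p) >> 4;
}

int cursor_packet_is_qw_aligned(const cursor_packet_t *p)
{
    return (cursor_packet_bytes(p) & 0xF) == 0;
}

/* Pads with zero bytes up to the next quadword boundary. */
void cursor_packet_align_to_qword(cursor_packet_t *p)
{
    size_t pad = (16 - (cursor_packet_bytes(p) & 0xF)) & 0xF;
    assert(cursor_packet_bytes(p) + pad <= p->capacity);
    memset(p->next, 0, pad);
    p->next += pad;
}
```

With this shape, a VIF unpack helper could record the cursor when the unpack is opened and compute the amount of data from the difference when it is closed, so no `*_unpack_add_*` variants would be needed.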

@h4570
Contributor Author

h4570 commented Nov 23, 2020

Hi Rick.

I played a little with _vu and renamed it to _utils to emphasize that this is just an addition. So the necessary functions, like "upload_micro_program()", were moved out.

The vu_add() functions were removed.

On the client side, packets can be merged via memcpy() (packet2_add()) or by reference. Below is an example of that.

[image]

To be honest, I don't want to make the sample too complex. I refactored the sample to do 1 DMA send per cube, but did it with packet2_add().

I also added print_qws() and print_data(), which can be helpful (and were!) with debugging.

I think it is a good idea to add new features in the next PRs. This is the big one! :P

Thanks!

@h4570
Contributor Author

h4570 commented Nov 23, 2020

The reason that vu1.c had send_matrix() in a separate DMA packet is the real-world scenario.

In the real world, a game is not rendering a cube with several vertices, but meshes with 10-20k vertices. In Tyra I'm sending the view/proj matrix once per mesh, and after that I'm splitting the big 3D mesh into many small packets and sending 2-3 of them at once. Sony recommended about 60 verts at once per buffer, because of the VU memory limit.

But of course, the view/proj matrix can be attached to the first small packet of the 3D mesh, which is faster.
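The splitting described above comes down to simple batch arithmetic. The following sketch shows one way to compute it. `batch_t`, `batch_count` and `batch_at` are hypothetical names, and the batch size of 60 is just the figure mentioned in the comment, not a value taken from Tyra or ps2sdk.

```c
#include <assert.h>
#include <stddef.h>

/* Batch size from the comment above; a multiple of 3 keeps triangles whole. */
#define VERTS_PER_BATCH 60u

/* A contiguous slice of the mesh's vertex array. */
typedef struct {
    size_t first; /* index of the first vertex in the batch */
    size_t count; /* number of vertices in the batch */
} batch_t;

/* Number of batches needed for total_verts vertices (rounds up). */
size_t batch_count(size_t total_verts, size_t per_batch)
{
    assert(per_batch > 0);
    return (total_verts + per_batch - 1) / per_batch;
}

/* Returns the vertex range of batch number index. */
batch_t batch_at(size_t total_verts, size_t per_batch, size_t index)
{
    assert(index < batch_count(total_verts, per_batch));
    batch_t b;
    b.first = index * per_batch;
    b.count = total_verts - b.first < per_batch ? total_verts - b.first
                                                : per_batch;
    return b;
}
```

For a 10,000-vertex mesh this gives 167 batches, the last one holding the remaining 40 vertices. The matrix upload would be added once, ahead of the first batch, rather than per batch.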

@rickgaiser rickgaiser merged commit f870e2d into ps2dev:master Nov 24, 2020
@h4570 h4570 deleted the feature/vu1-sample branch November 25, 2020 15:40
@morenoruiz

Thanks, son..
